Unsupervised Mining of Knowledge Gaps in Scientific Literature

Abstract: Literature-Based Discovery (LBD) relies on identifying gaps in the scientific literature. Most existing methods are supervised and depend on large domain-specific knowledge bases, such as MEDLINE for medicine. We present a tractable approach that combines Natural Language Processing techniques requiring few linguistic resources with Formal Concept Lattice exploration. Entities are automatically extracted from full-text scientific papers based on their acronym forms. An unsupervised classification is built using syntax and WordNet relations. The resulting classes are clustered into multiple formal concepts, and the knowledge gaps are identified in the resulting Galois lattice. The feasibility and relevance of the outcome are analyzed on a large corpus of full-text journal articles dealing with nuclear energy research.
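To make the formal-concept step more concrete, the following is a minimal sketch of how formal concepts can be enumerated from a small binary context. It is not the authors' implementation: the toy context, the function names (`common_attributes`, `common_objects`, `formal_concepts`) and the brute-force enumeration are all assumptions chosen for illustration. How knowledge gaps are then located in the lattice is not described in the abstract and is not modeled here.

```python
from itertools import combinations

# Hypothetical toy context: each object (e.g. an extracted entity)
# is mapped to the set of attributes (e.g. classes) it belongs to.
context = {
    "reactor": {"energy", "device"},
    "turbine": {"energy", "device"},
    "uranium": {"energy", "material"},
    "graphite": {"material"},
}


def common_attributes(objects, context):
    """Return the attributes shared by every object in `objects`.

    For an empty object set this is, by convention, the full attribute set.
    """
    shared = set().union(*context.values())
    for obj in objects:
        shared &= context[obj]
    return shared


def common_objects(attributes, context):
    """Return the objects that carry every attribute in `attributes`."""
    return {obj for obj, attrs in context.items() if attributes <= attrs}


def formal_concepts(context):
    """Enumerate all formal concepts (extent, intent) of a small context.

    Brute force: close every subset of objects. This is exponential in the
    number of objects and only suitable for toy examples.
    """
    concepts = set()
    objects = sorted(context)
    for size in range(len(objects) + 1):
        for subset in combinations(objects, size):
            intent = common_attributes(subset, context)
            extent = common_objects(intent, context)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts


if __name__ == "__main__":
    # Print concepts from the most specific (smallest extent) upward.
    for extent, intent in sorted(formal_concepts(context),
                                 key=lambda c: (len(c[0]), sorted(c[0]))):
        print(sorted(extent), sorted(intent))
```

Ordered by inclusion of their extents, the concepts returned by `formal_concepts` form the Galois lattice of the context. For a large corpus, a dedicated lattice-construction algorithm would be needed in place of this exhaustive enumeration.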

https://hal-univ-avignon.archives-ouvertes.fr/hal-02152783
Contributor: Pierre Jourlin
Submitted on: Tuesday, June 11, 2019 - 4:46:08 PM
Last modification on: Tuesday, June 18, 2019 - 1:24:18 AM

File

jadt2010_final.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-02152783, version 1

Citation

Silvia Fernandez, Pierre Jourlin, Eric Sanjuan. Unsupervised Mining of Knowledge Gaps in Scientific Literature. Journées d’Analyse statistique des Données Textuelles, Jun 2010, Rome, Italy. ⟨hal-02152783⟩
