Conference papers

Unsupervised Mining of Knowledge Gaps in Scientific Literature

Abstract : Literature-Based Discovery (LBD) relies on identifying gaps in the scientific literature. Most existing methods are supervised and depend on large domain-specific knowledge bases, such as MedLine for medicine. We present a tractable approach based on Natural Language Processing techniques that require few linguistic resources, combined with Formal Concept Lattice exploration. Entities are automatically extracted from full-text scientific papers based on their acronym forms. An unsupervised classification is built using syntax and WordNet relations. The resulting classes are clustered into multiple formal concepts, and the knowledge gaps are identified in the resulting Galois lattice. The feasibility and relevance of the outcome are analyzed on a large corpus of full-text journal articles dealing with nuclear energy research.
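
As a rough illustration of two steps named in the abstract (acronym-based entity extraction and the construction of formal concepts), here is a minimal Python sketch. It is not the authors' implementation: the function names `extract_acronyms` and `formal_concepts`, the initial-letter matching rule, and the toy context of reactor types and attributes are all assumptions made for this example. The syntax and WordNet classification step and the gap detection itself are not shown, since the abstract does not specify them.

```python
import re
from itertools import combinations


def extract_acronyms(text):
    """Return {acronym: long form} for patterns like 'long form (ACR)'.

    Assumed heuristic: the words just before the parenthesis must have
    initials that spell the acronym exactly.
    """
    found = {}
    for match in re.finditer(r"\(([A-Z]{2,})\)", text):
        acronym = match.group(1)
        words = re.findall(r"[A-Za-z]+", text[:match.start()])
        candidate = words[-len(acronym):]
        initials = "".join(word[0].upper() for word in candidate)
        if initials == acronym:
            found[acronym] = " ".join(candidate)
    return found


def formal_concepts(context):
    """Enumerate the formal concepts of a small binary context.

    `context` maps each object to its set of attributes. A formal concept
    is a pair (extent, intent) where the intent is exactly the set of
    attributes shared by the extent, and the extent is exactly the set of
    objects carrying every attribute of the intent. This brute-force
    enumeration is exponential and only suitable for toy inputs.
    """
    objects = list(context)
    all_attributes = set().union(*context.values()) if context else set()
    concepts = set()
    for size in range(len(objects) + 1):
        for subset in combinations(objects, size):
            # Intent: attributes common to every object in the subset.
            intent = set(all_attributes)
            for obj in subset:
                intent &= context[obj]
            # Extent: every object that carries the whole intent.
            extent = {obj for obj in objects if intent <= context[obj]}
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts


# Hypothetical toy data, invented for illustration (not from the paper's corpus).
sample = "Literature Based Discovery (LBD) relies on gaps in the literature."
acronyms = extract_acronyms(sample)

toy_context = {
    "PWR": {"reactor", "water"},
    "BWR": {"reactor", "water"},
    "LMFBR": {"reactor", "metal"},
}
concepts = formal_concepts(toy_context)
```

Ordered by inclusion of their extents, the concepts returned by `formal_concepts` form the Galois lattice of the toy context. A real corpus would need a dedicated lattice algorithm rather than this exhaustive enumeration.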

Contributor : Pierre Jourlin
Submitted on : Tuesday, June 11, 2019 - 4:46:08 PM
Last modification on : Wednesday, June 16, 2021 - 6:14:01 PM




  • HAL Id : hal-02152783, version 1



Silvia Fernandez, Pierre Jourlin, Eric Sanjuan. Unsupervised Mining of Knowledge Gaps in Scientific Literature. Journées d’Analyse statistique des Données Textuelles, Jun 2010, Rome, Italy. ⟨hal-02152783⟩


