DocumentCode :
3337027
Title :
Information Extraction as an Ontology Population Task and Its Application to Genic Interactions
Author :
Manine, Alain-Pierre ; Alphonse, Erick ; Bessieres, P.
Author_Institution :
Lab. d'Inf. Paris-Nord, Univ. Paris, Villetaneuse
Volume :
2
fYear :
2008
fDate :
3-5 Nov. 2008
Firstpage :
74
Lastpage :
81
Abstract :
Ontologies are a well-motivated formal representation for modeling the knowledge needed to extract and encode data from text. Yet their tight integration with Information Extraction (IE) systems is still a research issue, all the more so for complex ontologies that go beyond hierarchies. In this paper, we introduce an original architecture where IE is specified by designing an ontology, and the extraction process is seen as an Ontology Population (OP) task. Concepts and relations of the ontology define a normalized text representation. As their abstraction level is irrelevant for text extraction, we introduce a Lexical Layer (LL) along with the ontology, i.e. relations and classes at an intermediate level of normalization between raw text and concepts. In contrast to previous IE systems, the extraction process only involves normalizing the outputs of Natural Language Processing (NLP) modules with instances of the ontology and the LL. All the remaining reasoning is left to a query module, which uses the inference rules of the ontology to derive new instances by deduction. In this context, these inference rules subsume classical extraction rules or patterns by providing access to the appropriate abstraction level and domain knowledge. To acquire those rules, we adopt an Ontology Learning (OL) perspective and automatically acquire the inference rules with relational Machine Learning (ML). Our approach is validated on a genic interaction extraction task from a Bacillus subtilis bacterium text corpus. We reach a global recall of 89.3% and a precision of 89.6%, with high scores for the ten conceptual relations in the ontology.
Keywords :
biology computing; information retrieval; learning (artificial intelligence); natural language processing; ontologies (artificial intelligence); text analysis; genic interactions; information extraction; lexical layer; natural language processing; ontology population task; relational machine learning; text extraction; Artificial intelligence; Data mining; Databases; Knowledge representation; Machine learning; Natural language processing; Ontologies; Pipelines; Scattering; Thesauri; Genic Interactions; Inductive Logic Programming; Information Extraction; Ontology Learning; Ontology Population;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Tools with Artificial Intelligence, 2008. ICTAI '08. 20th IEEE International Conference on
Conference_Location :
Dayton, OH
ISSN :
1082-3409
Print_ISBN :
978-0-7695-3440-4
Type :
conf
DOI :
10.1109/ICTAI.2008.117
Filename :
4669758