DocumentCode
2117916
Title
TaxoLearn: A Semantic Approach to Domain Taxonomy Learning
Author
Dietz, E. ; Vandic, D. ; Frasincar, Flavius
Author_Institution
Econometric Inst., Erasmus Univ., Rotterdam, Netherlands
Volume
1
fYear
2012
fDate
4-7 Dec. 2012
Firstpage
58
Lastpage
65
Abstract
Building domain taxonomies is a crucial task in ontology construction. Domain taxonomy learning is becoming increasingly important as a way of automatically obtaining a knowledge representation of a given domain. The alternative, developing domain taxonomies manually, is not trivial: the main issues are the unavailability of a domain knowledge expert and the considerable effort the task requires. This paper proposes TaxoLearn, an approach to the automatic construction of domain taxonomies. TaxoLearn is a new methodology that combines aspects of existing approaches but also adds new steps to improve the quality of the resulting domain taxonomy. The contribution of this paper is threefold. First, we employ a word sense disambiguation step when detecting concepts in the text. Second, we show the use of semantics-based hierarchical clustering for taxonomy learning. Third, we propose a novel dynamic labeling procedure for the concept clusters. We evaluate our approach by comparing the machine-generated taxonomy with a manually constructed golden taxonomy. On a corpus of documents in the field of financial economics, TaxoLearn shows high precision for the learned taxonomic concept relationships.
Keywords
financial management; learning (artificial intelligence); natural language processing; ontologies (artificial intelligence); pattern clustering; text analysis; TaxoLearn approach; automatic domain taxonomy construction; concept clusters; document corpus; domain knowledge expert nonavailability; domain taxonomy learning quality improvement; dynamic labeling procedure; financial economics; knowledge representation; machine generated taxonomy; manually constructed golden taxonomy; ontology construction domain; semantic-based hierarchical clustering; text concept detection; word sense disambiguation; concept learning; taxonomy learning
fLanguage
English
Publisher
IEEE
Conference_Titel
Web Intelligence and Intelligent Agent Technology (WI-IAT), 2012 IEEE/WIC/ACM International Conferences on
Conference_Location
Macau
Print_ISBN
978-1-4673-6057-9
Type
conf
DOI
10.1109/WI-IAT.2012.129
Filename
6511866