• DocumentCode
    2117916
  • Title
    TaxoLearn: A Semantic Approach to Domain Taxonomy Learning
  • Author
    Dietz, E. ; Vandic, D. ; Frasincar, Flavius
  • Author_Institution
    Econometric Inst., Erasmus Univ., Rotterdam, Netherlands
  • Volume
    1
  • fYear
    2012
  • fDate
    4-7 Dec. 2012
  • Firstpage
    58
  • Lastpage
    65
  • Abstract
    Building domain taxonomies is a crucial task in ontology construction. Domain taxonomy learning is becoming increasingly important as a way of automatically obtaining a knowledge representation of a given domain. The alternative, developing domain taxonomies manually, is not trivial: the main obstacles are the unavailability of a domain knowledge expert and the considerable effort the task requires. This paper proposes TaxoLearn, an approach to the automatic construction of domain taxonomies. TaxoLearn is a new methodology that combines aspects of existing approaches with new steps intended to improve the quality of the resulting domain taxonomy. The contribution of this paper is threefold. First, we employ a word sense disambiguation step when detecting concepts in the text. Second, we show the use of semantics-based hierarchical clustering for the purpose of taxonomy learning. Third, we propose a novel dynamic labeling procedure for the concept clusters. We evaluate our approach by comparing the machine-generated taxonomy with a manually constructed golden taxonomy. On a corpus of documents in the field of financial economics, TaxoLearn shows high precision for the learned taxonomic concept relationships.
  • Keywords
    financial management; learning (artificial intelligence); natural language processing; ontologies (artificial intelligence); pattern clustering; text analysis; TaxoLearn approach; automatic domain taxonomy construction; concept clusters; document corpus; domain knowledge expert nonavailability; domain taxonomy learning quality improvement; dynamic labeling procedure; financial economics; knowledge representation; machine generated taxonomy; manually constructed golden taxonomy; ontology construction domain; semantic-based hierarchical clustering; text concept detection; word sense disambiguation; concept learning; taxonomy learning
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Web Intelligence and Intelligent Agent Technology (WI-IAT), 2012 IEEE/WIC/ACM International Conferences on
  • Conference_Location
    Macau
  • Print_ISBN
    978-1-4673-6057-9
  • Type
    conf
  • DOI
    10.1109/WI-IAT.2012.129
  • Filename
    6511866