Title :
Comparison of the Effects of Morphological and Ontological Information on Text Categorization
Author :
Koirala, Cesar ; Rasheed, Khaled
Author_Institution :
Artificial Intell. Center, Univ. of Georgia, Athens, GA
Abstract :
In this paper we compare the effectiveness of morphological and ontological information for text categorization. We incorporate morphological information through stemmed features and ontological information through WordNet hypernyms, and we form text representations based on each. These representations are evaluated using four different machine learning algorithms on the Reuters-21578 dataset, and we report average F1 measures as the results. The results show that the stemming-based text representation performs better than the hypernym-based text representation, even though we used a novel hypernym formation approach. We also combine the stemming-based representation with the hypernym-based representation. The combined representation does not produce any significant improvement in performance. The results suggest that ontological information does not help in categorization tasks and is less effective than morphological information.
Keywords :
ontologies (artificial intelligence); text analysis; WordNet hypernyms; morphological information; ontological information; stemmed features; text categorization; Algorithm design and analysis; Animals; Artificial intelligence; Databases; Density measurement; Machine learning; Machine learning algorithms; Ontologies; Hypernyms; WordNet
Conference_Title :
Machine Learning and Applications, 2008. ICMLA '08. Seventh International Conference on
Conference_Location :
San Diego, CA
Print_ISBN :
978-0-7695-3495-4
DOI :
10.1109/ICMLA.2008.113