Title :
Using category-based semantic field for text categorization
Author :
Wang, Qiang ; Guan, Yi ; Wang, Xiao-long ; Xu, Zhi-Ming
Author_Institution :
Sch. of Comput. Sci. & Technol., Harbin Inst. of Technol., China
Abstract :
This paper proposes a new document representation method for text categorization. It applies category-based semantic field (CBSF) theory to obtain a more efficient representation of documents. Lexical chains are used to compute the CBSF, with Hownet serving as the lexical database. In particular, the title of each document serves as a clue for predicting the potential CBSF of a test document. Combined with a classifier, the approach is evaluated on text categorization, and the results indicate that it outperforms conventional methods whose features are selected under the bag-of-words (BOW) model on the same task.
Keywords :
classification; text analysis; CBSF theory; Hownet; SVM; bag-of-words system; category-based semantic field; document representation method; lexical chain; lexical database; text categorization; Computational complexity; Computer science; Information retrieval; Machine learning; Spatial databases; Statistical learning; Support vector machine classification; Support vector machines; Testing
Conference_Title :
Machine Learning and Cybernetics, 2005. Proceedings of 2005 International Conference on
Conference_Location :
Guangzhou, China
Print_ISBN :
0-7803-9091-1
DOI :
10.1109/ICMLC.2005.1527598