Title :
Textual data categorization: back to the phrase-based representation
Author :
Katrenko, Sophia
Author_Institution :
Fac. of Comput. Sci., Lviv Polytech. Nat. Univ., Ukraine
Abstract :
This paper focuses primarily on applying and evaluating the phrase-based representation used in document classification. This representation has been discussed over the last decades, but unfortunately its use has not always improved the accuracy of existing systems. We try to explain why and carry out experiments aimed at improving document categorization results.
Keywords :
classification; document handling; text analysis; document classification; machine learning; phrase-based representation; statistical measures; textual data categorization; Computer science; Dictionaries; Electronic mail; Machine learning; Statistical analysis; Testing; Text categorization;
Conference_Title :
Intelligent Systems, 2004. Proceedings. 2004 2nd International IEEE Conference
Print_ISBN :
0-7803-8278-1
DOI :
10.1109/IS.2004.1344853