DocumentCode :
3513297
Title :
A Novel Text Representation Model for Text Classification
Author :
Wang, Jun ; Zhou, Yiming
Author_Institution :
Sch. of Comput. Sci. & Eng., Beihang Univ., Beijing
fYear :
2008
fDate :
1-3 Nov. 2008
Firstpage :
702
Lastpage :
705
Abstract :
The text representation in text classification is usually a sequence of terms. As the number of terms becomes very large, performing existing text categorization tasks becomes very time-consuming. In this paper we present a novel text representation model for text classification that greatly reduces the required resources. This model represents text with several features. Each feature corresponds to a theme that emerges from a set of related articles. We also introduce an efficient way to build the model. The proposed model has been applied to a naive Bayes classifier, and experiments on the Reuters-21578 corpus show that efficiency is greatly improved without sacrificing classification accuracy, even when the dimension of the input space is significantly reduced.
Keywords :
Bayes methods; classification; text analysis; Reuters-21578 corpus; classification accuracy; naive Bayes classifier; text categorization tasks; text classification; text representation model; Clustering algorithms; Computer science; Indexing; Information retrieval; Intelligent networks; Intelligent systems; Natural language processing; Support vector machine classification; Support vector machines; Text categorization;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Intelligent Networks and Intelligent Systems, 2008. ICINIS '08. First International Conference on
Conference_Location :
Wuhan
Print_ISBN :
978-0-7695-3391-9
Electronic_ISBN :
978-0-7695-3391-9
Type :
conf
DOI :
10.1109/ICINIS.2008.21
Filename :
4683322