DocumentCode :
1631837
Title :
Term Weighting Approaches for Text Categorization Improving
Author :
Matsunaga, L.A. ; Ebecken, N.F.F.
Author_Institution :
Fed. District Legislative Assembly
Volume :
1
fYear :
2008
Firstpage :
409
Lastpage :
414
Abstract :
The text categorization problem examined in this paper is the automatic distribution of legislative bills to the committees of the Federal District Legislative Assembly in Brasilia, Brazil. In this study, replacing the idf part of TFIDF with a new term selection measure (absl logit) or with bi-normal separation produced the best overall classification results with support vector machine (SVM) models, compared with standard TFIDF and with replacing the idf part by common term selection measures: chi-square, information gain, gain ratio and odds ratio.
Keywords :
category theory; support vector machines; text analysis; support vector machines models; term selection measures; term weighting; text categorization; Assembly systems; Dictionaries; Frequency; Gain measurement; Intelligent systems; Support vector machine classification; Text mining; Vocabulary; text
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Intelligent Systems Design and Applications, 2008. ISDA '08. Eighth International Conference on
Conference_Location :
Kaohsiung
Print_ISBN :
978-0-7695-3382-7
Type :
conf
DOI :
10.1109/ISDA.2008.21
Filename :
4696241