Title :
Using χ2 as a feature selection method to improve the performance of a Contextual Entropy classifier
Author :
Moisés García Villanueva;Edgar Chávez González;Leonardo Romero Muñoz
Author_Institution :
Faculty of Electrical Engineering, UMSNH, Michoacán, México
Abstract :
Text categorization is the problem of classifying text documents into a set of predefined classes. This paper applies the χ2 feature selection method to improve the accuracy of a Contextual Entropy (CE) classifier on the text categorization task. After the documents in a class are preprocessed, that class is typically represented by large sparse vectors. The χ2 method selects only the most important features needed to represent the class appropriately. This reduction in the number of features per class improves the performance of text classifiers on large document collections: they run faster and require less memory. On two well-known datasets, applying χ2 feature selection to a CE classifier improves classification accuracy.
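As an illustration of the per-class feature selection described in the abstract, the following is a minimal sketch of the standard χ2 term/class statistic computed from a 2x2 document-count table. It is not the authors' implementation and does not include the CE classifier; the function names `chi2_score` and `select_features`, the token-list input format, and the tie-breaking rule are assumptions made for this example.

```python
from collections import Counter


def chi2_score(a, b, c, d):
    """Chi-squared statistic for one term and one class.

    a: documents in the class that contain the term
    b: documents outside the class that contain the term
    c: documents in the class that lack the term
    d: documents outside the class that lack the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        # Degenerate table (an empty row or column): no evidence either way.
        return 0.0
    return n * (a * d - c * b) ** 2 / denom


def select_features(docs, labels, target, k):
    """Return the k terms with the highest chi-squared score for `target`.

    docs: list of token lists (hypothetical preprocessed documents)
    labels: list of class labels, aligned with docs
    """
    n_pos = sum(1 for y in labels if y == target)
    n_neg = len(labels) - n_pos

    # Document frequencies of each term inside and outside the target class.
    pos_df, neg_df = Counter(), Counter()
    for tokens, y in zip(docs, labels):
        (pos_df if y == target else neg_df).update(set(tokens))

    scores = {}
    for term in set(pos_df) | set(neg_df):
        a, b = pos_df[term], neg_df[term]
        scores[term] = chi2_score(a, b, n_pos - a, n_neg - b)

    # Highest score first; ties broken alphabetically (an arbitrary choice).
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]
```

Calling `select_features(docs, labels, target, k)` once per class yields a reduced vocabulary for that class. Note that the statistic scores any strong association, so a term that is characteristically absent from a class can rank as highly as one that is characteristically present.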
Keywords :
"Entropy","Training","Text categorization","Context","Electronic mail","Vocabulary","Time-frequency analysis"
Conference_Title :
Power, Electronics and Computing (ROPEC), 2015 IEEE International Autumn Meeting on
DOI :
10.1109/ROPEC.2015.7395127