Title :
A new method for attribute extraction with application on text classification
Author :
Göksel Biricik;Banu Diri;Ahmet Coşkun Sönmez
Author_Institution :
Computer Engineering Department, Yildiz Technical University, İstanbul, Turkey
Abstract :
We introduce a new dimensionality reduction method based on attribute extraction and evaluate its impact on text classification. The attributes used for classification are the words in the body sections of the news stories in Reuters-21578. The proposed method projects this high-dimensional set of attributes, the words extracted from the news bodies, onto a new hyperplane whose number of dimensions equals the number of classes. Results show that the processing times of classification algorithms decrease dramatically with the proposed attribute extraction method, because far fewer attributes are passed to the classifiers. The accuracies of the classification algorithms also increase compared with tests run without the proposed method.
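The abstract does not say how the projection is computed, so the following is only a minimal sketch of one way to map word attributes onto as many dimensions as there are classes. It assumes each word is weighted by the share of its training occurrences that fall in each class, and that a document's reduced vector is the sum of the weights of its words. The function names `fit_term_class_weights` and `project`, the weighting scheme, and the toy data are all hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


def fit_term_class_weights(docs, labels):
    """Assumed weighting: share of each term's occurrences per class."""
    classes = sorted(set(labels))
    index = {c: i for i, c in enumerate(classes)}
    counts = defaultdict(lambda: [0.0] * len(classes))
    for tokens, label in zip(docs, labels):
        for term, tf in Counter(tokens).items():
            counts[term][index[label]] += tf
    weights = {}
    for term, row in counts.items():
        total = sum(row)
        # Each row sums to 1 and has one entry per class.
        weights[term] = [v / total for v in row]
    return classes, weights


def project(tokens, weights, n_classes):
    """Map a bag of words to a vector with one dimension per class."""
    vec = [0.0] * n_classes
    for term, tf in Counter(tokens).items():
        row = weights.get(term)
        if row is None:
            continue  # terms unseen in training contribute nothing
        for i, w in enumerate(row):
            vec[i] += tf * w
    return vec


# Toy data (hypothetical): four tokenized news bodies in two classes.
train_docs = [
    ["wheat", "harvest", "export"],
    ["wheat", "grain", "price"],
    ["oil", "barrel", "price"],
    ["oil", "crude", "export"],
]
train_labels = ["grain", "grain", "crude", "crude"]

classes, weights = fit_term_class_weights(train_docs, train_labels)
reduced = project(["wheat", "price", "oil"], weights, len(classes))
print(classes, reduced)
```

In this sketch the eight distinct training words collapse to two attributes, one per class, which is the kind of reduction the abstract credits for the shorter classifier processing times. Any classifier can then be trained on the reduced vectors in place of the original word counts.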
Keywords :
"Text categorization","Data mining","Classification algorithms","Testing","Principal component analysis","Information retrieval","Frequency","Filters","Application software","Indexing"
Conference_Title :
Soft Computing, Computing with Words and Perceptions in System Analysis, Decision and Control, 2009. ICSCCW 2009. Fifth International Conference on
Print_ISBN :
978-1-4244-3429-9
DOI :
10.1109/ICSCCW.2009.5379479