DocumentCode
3529188
Title
Improved Single-Label Text Categorization by Instance Filtration
Author
Khan, Kashif Ullah ; Qamar, Usman
Author_Institution
Dept. of Comput. Eng., Nat. Univ. of Sci. & Technol. (NUST), Islamabad, Pakistan
fYear
2015
fDate
8-10 July 2015
Firstpage
28
Lastpage
35
Abstract
Machine learning classifiers are widely used for text categorization; however, a classifier misclassifies some instances into a category that is related to their actual category. The categorization ability of a classifier can be improved by filtering the dataset with a better classifier and removing such categories for misclassified instances. In this paper, we propose a two-level approach in which level 1 filters instances according to their likelihood in each category and reduces the training dataset to the top-ranked 't' categories and their instances, whereas the level-2 classifier classifies instances using the filtered training set. We employed Naïve Bayes, SVM and KNN as machine learning classifiers. Experimental evaluations on the standard Reuters-21578, Cade12 and 20 Newsgroups datasets showed improved categorization effectiveness as measured by accuracy, precision, recall and F-measure.
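The two-level idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which classifier is used at which level, so this sketch assumes a multinomial Naïve Bayes ranker at level 1 and a cosine-similarity KNN at level 2. All function names (`train_nb`, `nb_log_scores`, `two_level_classify`) and the parameters `t` and `k` are hypothetical, and documents are assumed to be pre-tokenized lists of words.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs, labels):
    """Collect per-category token counts for a multinomial Naive Bayes model."""
    token_counts = defaultdict(Counter)
    doc_counts = Counter(labels)
    vocab = set()
    for tokens, label in zip(docs, labels):
        token_counts[label].update(tokens)
        vocab.update(tokens)
    return token_counts, doc_counts, vocab


def nb_log_scores(tokens, model):
    """Return {category: log prior + log likelihood}, with Laplace smoothing."""
    token_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for cat in doc_counts:
        total = sum(token_counts[cat].values())
        score = math.log(doc_counts[cat] / total_docs)
        for tok in tokens:
            score += math.log((token_counts[cat][tok] + 1) / (total + len(vocab)))
        scores[cat] = score
    return scores


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[key] * b[key] for key in a if key in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def two_level_classify(tokens, docs, labels, t=2, k=1):
    """Assumed pipeline: level 1 keeps the top-t categories, level 2 runs KNN."""
    # Level 1: rank categories by Naive Bayes score and keep the best t.
    scores = nb_log_scores(tokens, train_nb(docs, labels))
    top = set(sorted(scores, key=scores.get, reverse=True)[:t])

    # Reduce the training set to instances of the surviving categories.
    filtered = [(d, lab) for d, lab in zip(docs, labels) if lab in top]

    # Level 2: majority vote among the k most similar filtered instances.
    query = Counter(tokens)
    neighbours = sorted(
        filtered, key=lambda pair: cosine(query, Counter(pair[0])), reverse=True
    )[:k]
    votes = Counter(lab for _, lab in neighbours)
    return votes.most_common(1)[0][0], top
```

With `t` equal to the number of categories, the filter removes nothing and the sketch reduces to plain KNN; smaller values of `t` discard the categories that level 1 considers unlikely before level 2 makes its decision.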
Keywords
Bayes methods; learning (artificial intelligence); pattern classification; support vector machines; text analysis; KNN; Naïve Bayes; SVM; categorization ability; categorization effectiveness; filtering dataset; instance filtration; machine learning classifier; single-label text categorization; Accuracy; Computational modeling; Filtration; Standards; Support vector machines; Text categorization; Training; text classification
fLanguage
English
Publisher
IEEE
Conference_Titel
Complex, Intelligent, and Software Intensive Systems (CISIS), 2015 Ninth International Conference on
Conference_Location
Blumenau
Print_ISBN
978-1-4799-8869-3
Type
conf
DOI
10.1109/CISIS.2015.10
Filename
7185162