DocumentCode
130026
Title
Automated text categorization by generalized kernel machines
Author
Jiaqi Tan ; Wenye Li ; Haoming Li
Author_Institution
Macao Polytech. Inst., Macao, China
fYear
2014
fDate
28-30 July 2014
Firstpage
376
Lastpage
381
Abstract
Text categorization is the task of automatically classifying text documents into different groups. With wide applications in intelligent information processing, it has attracted much recent research attention. The classical support vector machine (SVM) algorithm has achieved significant success on this task, and its achievements have inspired wide study of a family of related kernel methods. This paper investigates a novel kernel method for text categorization. Unlike SVM and related approaches, which operate on the thousands of word-term features of a text corpus, our method also takes concept features into consideration. We use a generalized regularizer that leaves the concept features unregularized. With a squared-loss function measuring the empirical error, the method has a simple convex solution. Evaluations on benchmark text categorization applications verify both the effectiveness and the efficiency of the method.
Keywords
pattern classification; support vector machines; text analysis; SVM algorithm; automated text categorization; benchmarked text categorization application; convex solution; empirical error; generalized kernel machines; intelligent information processing; kernel method; squared-loss function; support vector machines algorithm; text corpus; text document classification; Accuracy; Complexity theory; Computers; Kernel; Support vector machines; Text categorization; Training; Generalized Kernel Method; Regularized Least-Squares Classification; Support Vector Machines; Text Categorization;
fLanguage
English
Publisher
IEEE
Conference_Titel
Information and Automation (ICIA), 2014 IEEE International Conference on
Conference_Location
Hailar
Type
conf
DOI
10.1109/ICInfA.2014.6932685
Filename
6932685