Title :
Automated text categorization by generalized kernel machines
Author :
Jiaqi Tan ; Wenye Li ; Haoming Li
Author_Institution :
Macao Polytechnic Institute, Macao, China
Abstract :
Text categorization is the task of automatically classifying text documents into groups. With wide applications in intelligent information processing, it has attracted much recent research attention. The classical support vector machine (SVM) algorithm has achieved significant success on this task, and, inspired by that success, a family of related kernel methods is being widely studied. This paper investigates a novel kernel method for text categorization. Unlike SVM and related approaches, which operate on thousands of word-term features of a text corpus, our method also takes concept features into consideration. We use a generalized regularizer that leaves the concept features unregularized. With a squared-loss function measuring the empirical error, the method reduces to a convex problem with a simple solution. In evaluations on benchmark text categorization tasks, we verified both the effectiveness and the efficiency of the method.
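The idea of a squared loss combined with a regularizer that skips the concept features can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a linear (primal) model rather than the kernel formulation, and the function name `fit_partially_regularized_ls`, the matrices `X` (word-term features) and `Z` (concept features), and the parameter `lam` are all hypothetical. Under those assumptions the objective ||Xw + Zb - y||^2 + lam ||w||^2 is convex, and its minimizer comes from a single linear solve.

```python
import numpy as np

def fit_partially_regularized_ls(X, Z, y, lam):
    """Minimize ||X w + Z b - y||^2 + lam * ||w||^2 over (w, b).

    X: (n, d) word-term features, penalized by lam (assumed lam > 0).
    Z: (n, m) concept features, left unregularized (assumed full column rank).
    """
    d, m = X.shape[1], Z.shape[1]
    A = np.hstack([X, Z])
    # Penalty matrix: lam on word-term weights, zero on concept weights.
    P = np.diag(np.concatenate([np.full(d, lam), np.zeros(m)]))
    # Normal equations of the convex objective: (A^T A + P) theta = A^T y.
    theta = np.linalg.solve(A.T @ A + P, A.T @ y)
    return theta[:d], theta[d:]

# Toy data (synthetic, for illustration only): labels in {-1, +1}.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 5))
Z = rng.normal(size=(40, 2))
y = np.sign(X[:, 0] + 2.0 * Z[:, 0])

w, b = fit_partially_regularized_ls(X, Z, y, lam=1.0)
pred = np.sign(X @ w + Z @ b)  # predicted class labels on the training set
```

Setting the penalty to zero on the concept block is what distinguishes this sketch from ordinary ridge regression or regularized least-squares classification, where every weight is shrunk toward zero.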
Keywords :
pattern classification; support vector machines; text analysis; SVM algorithm; automated text categorization; benchmarked text categorization application; convex solution; empirical error; generalized kernel machines; intelligent information processing; kernel method; squared-loss function; support vector machines algorithm; text corpus; text document classification; Accuracy; Complexity theory; Computers; Kernel; Support vector machines; Text categorization; Training; Generalized Kernel Method; Regularized Least-Squares Classification; Support Vector Machines; Text Categorization;
Conference_Title :
Information and Automation (ICIA), 2014 IEEE International Conference on
Conference_Location :
Hailar
DOI :
10.1109/ICInfA.2014.6932685