Title :
An Improved Algorithm for Multiclass Text Categorization with Support Vector Machine
Author :
Shao, Fubo ; He, Guoping ; Zhang, Xin
Author_Institution :
College of Information Science and Engineering, Shandong University of Science and Technology, Qingdao
Abstract :
Automated text categorization is attractive because it frees organizations from having to organize their document bases manually. The Support Vector Machine (SVM) is an efficient technique for text categorization, and computing the kernel matrix is the key step in applying it. When there are many kinds of texts, the text matrix becomes sparse, and computing the kernel matrix directly wastes considerable time and memory. To save time, this paper explores the use of a hash function in computing the kernel matrix and then proposes an improved algorithm for multiclass text categorization. The paper also establishes the favorable properties of the improved algorithm from both theoretical and experimental perspectives. A comparison with the original algorithm shows that the improved algorithm saves substantial computation time.
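The abstract does not spell out the hashing scheme, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes each document is stored as a hash table of term frequencies and that a linear kernel is used; the names `to_sparse`, `sparse_dot`, and `kernel_matrix` are hypothetical and chosen for illustration. Each kernel entry then touches only the terms that actually occur in a document, rather than the full vocabulary.

```python
from collections import Counter


def to_sparse(tokens):
    """Store a tokenized document as a hash table {term: frequency}."""
    return Counter(tokens)


def sparse_dot(a, b):
    """Dot product over the terms shared by two hashed documents."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the smaller table, look up in the larger
    return sum(freq * b[term] for term, freq in a.items() if term in b)


def kernel_matrix(docs):
    """Symmetric linear-kernel (Gram) matrix; assumed kernel, not the paper's."""
    vecs = [to_sparse(d) for d in docs]
    n = len(vecs)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):  # fill the upper triangle and mirror it
            K[i][j] = K[j][i] = float(sparse_dot(vecs[i], vecs[j]))
    return K


# Toy corpus, invented for illustration only.
docs = [
    ["svm", "kernel", "text"],
    ["text", "text", "hash"],
    ["kernel", "matrix"],
]
K = kernel_matrix(docs)
```

In this sketch the cost of one kernel entry depends on the size of the smaller document rather than on the vocabulary size, which is where the time and memory savings for sparse text data would come from.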
Keywords :
file organisation; support vector machines; text analysis; hash function; kernel matrix; multiclass text categorization; support vector machine; Algorithm design and analysis; Computational intelligence; Frequency; Helium; Kernel; Organizing; Sparse matrices; Support vector machine classification; Support vector machines; Text categorization;
Conference_Title :
Computational Intelligence and Design, 2008. ISCID '08. International Symposium on
Conference_Location :
Wuhan
Print_ISBN :
978-0-7695-3311-7
DOI :
10.1109/ISCID.2008.152