DocumentCode :
2021229
Title :
An Improved Algorithm for Multiclass Text Categorization with Support Vector Machine
Author :
Shao, Fubo ; He, Guoping ; Zhang, Xin
Author_Institution :
Coll. of Inf. Sci. & Eng., Shandong Univ. of Sci. & Technol., Qingdao
Volume :
1
fYear :
2008
fDate :
17-18 Oct. 2008
Firstpage :
336
Lastpage :
339
Abstract :
Automated text categorization is attractive because it frees organizations from having to organize document bases manually. The Support Vector Machine (SVM) is an efficient technique for text categorization, and computing the kernel matrix is the key step in SVM-based text categorization. When there are many kinds of texts, the text matrix becomes sparse, and computing the kernel matrix directly wastes considerable time and memory. To save time, the paper explores the use of a hash function in computing the kernel matrix and, on that basis, proposes an improved algorithm for multiclass text categorization. The paper examines the properties of the improved algorithm from both theoretical and experimental perspectives and compares it with the original algorithm. The experiments show that the improved algorithm saves substantial computational time.
Keywords :
file organisation; support vector machines; text analysis; hash function; kernel matrix; multiclass text categorization; support vector machine; Algorithm design and analysis; Computational intelligence; Frequency; Helium; Kernel; Organizing; Sparse matrices; Support vector machine classification; Support vector machines; Text categorization
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computational Intelligence and Design, 2008. ISCID '08. International Symposium on
Conference_Location :
Wuhan
Print_ISBN :
978-0-7695-3311-7
Type :
conf
DOI :
10.1109/ISCID.2008.152
Filename :
4725621