DocumentCode
466109
Title
Scaling Text Classification with Relevance Vector Machines
Author
Silva, Catarina ; Ribeiro, Bernardete
Author_Institution
Polytech. Inst. of Leiria, Leiria
Volume
5
fYear
2006
fDate
8-11 Oct. 2006
Firstpage
4186
Lastpage
4191
Abstract
Text classification (TC) is a complex, ubiquitous task that involves large amounts of data. Recent research has shown that kernel-based learning methods are quite effective for this problem. Unlike support vector machines (SVM), the relevance vector machine (RVM) yields a probabilistic output while preserving accuracy. However, few research efforts have addressed the scalability issue that arises when applying RVM to large-scale problems such as TC. We propose a new model consisting of a two-step RVM classifier that can (i) be competitive in processing time, (ii) use all available training elements and (iii) improve RVM classification performance. The paper also shows that a convenient similitude measure among documents can be defined over the whole collection, which makes the process not only faster but also parallelizable. Using REUTERS-21578, we show that the proposed model reduces computational complexity and improves overall performance, making the deployment of successful real-time applications possible.
Keywords
classification; computational complexity; probability; support vector machines; text analysis; kernel learning based method; relevance vector machine; support vector machine; text classification; Bayesian methods; Cybernetics; Frequency conversion; Informatics; Kernel; Large-scale systems; Scalability; Support vector machine classification; Support vector machines; Text categorization
fLanguage
English
Publisher
IEEE
Conference_Titel
Systems, Man and Cybernetics, 2006. SMC '06. IEEE International Conference on
Conference_Location
Taipei
Print_ISBN
1-4244-0099-6
Electronic_ISBN
1-4244-0100-3
Type
conf
DOI
10.1109/ICSMC.2006.384791
Filename
4274556