• DocumentCode
    466109
  • Title
    Scaling Text Classification with Relevance Vector Machines
  • Author
    Silva, Catarina ; Ribeiro, Bernardete
  • Author_Institution
    Polytech. Inst. of Leiria, Leiria
  • Volume
    5
  • fYear
    2006
  • fDate
    8-11 Oct. 2006
  • Firstpage
    4186
  • Lastpage
    4191
  • Abstract
    Text classification (TC) is a complex, ubiquitous task that handles huge amounts of data. Recent research has shown that kernel-based learning methods are quite effective for this problem. Unlike support vector machines (SVM), the relevance vector machine (RVM) yields a probabilistic output while preserving accuracy. However, few research efforts have addressed the scalability issue that arises when applying RVM to large-scale problems such as TC. We propose a new model consisting of a two-step RVM classifier that is able to (i) be competitive in processing time, (ii) use all available training elements, and (iii) improve RVM classification performance. The paper also shows that a convenient similarity measure among documents can be defined over the whole collection, which makes the process not only faster but also parallelizable. Using REUTERS-21578, we show that the proposed model reduces computational complexity and improves overall performance, making the deployment of successful real-time applications possible.
  • Keywords
    classification; computational complexity; probability; support vector machines; text analysis; kernel learning based method; relevance vector machine; support vector machine; text classification; Bayesian methods; Cybernetics; Frequency conversion; Informatics; Kernel; Large-scale systems; Scalability; Support vector machine classification; Support vector machines; Text categorization
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Systems, Man and Cybernetics, 2006. SMC '06. IEEE International Conference on
  • Conference_Location
    Taipei
  • Print_ISBN
    1-4244-0099-6
  • Electronic_ISBN
    1-4244-0100-3
  • Type
    conf
  • DOI
    10.1109/ICSMC.2006.384791
  • Filename
    4274556