• DocumentCode
    2963752
  • Title
    Large-scale patent classification with min-max modular support vector machines
  • Author
    Chu, Xiao-Lei ; Ma, Chao ; Li, Jing ; Lu, Bao-Liang ; Utiyama, Masao ; Isahara, Hitoshi
  • Author_Institution
    Dept. of Comput. Sci. & Eng., Shanghai Jiao Tong Univ., Shanghai
  • fYear
    2008
  • fDate
    1-8 June 2008
  • Firstpage
    3973
  • Lastpage
    3980
  • Abstract
    Patent classification is a large-scale, hierarchical, imbalanced, multi-label problem. The number of samples in a real-world patent classification task typically exceeds one million, and this number increases every year. An effective patent classifier must be able to handle data at this scale. This paper discusses the use of the min-max modular support vector machine (M3-SVM) for large-scale patent classification problems. The method has three steps: decomposing a large-scale, imbalanced patent classification problem into a group of smaller, more balanced two-class subproblems that are independent of each other; learning these subproblems in parallel using support vector machines (SVMs); and combining all of the trained SVMs according to the minimization and maximization rules. M3-SVM has two features that large-scale patent classification urgently needs. First, it can be realized in a massively parallel form. Second, it can be built up incrementally. Results from experiments on the NTCIR-5 patent data set, which contains more than two million patents, confirm these two features and demonstrate that M3-SVM outperforms conventional SVMs in both training time and generalization performance.
  • Keywords
    optimisation; patents; pattern classification; support vector machines; maximization rules; min-max modular support vector machines; minimization rules; multilabel problem; patent classification; patent classifier; Chaos; Databases; Humans; Large-scale systems; Machine learning; Neural networks; Pattern classification; Scalability; Support vector machine classification; Support vector machines
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Neural Networks, 2008. IJCNN 2008. (IEEE World Congress on Computational Intelligence). IEEE International Joint Conference on
  • Conference_Location
    Hong Kong
  • ISSN
    1098-7576
  • Print_ISBN
    978-1-4244-1820-6
  • Electronic_ISBN
    1098-7576
  • Type
    conf
  • DOI
    10.1109/IJCNN.2008.4634369
  • Filename
    4634369
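
The three steps described in the abstract (decomposition, parallel training, min-max combination) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the usual min-max modular scheme in which one module is trained per pair of positive and negative subsets, and it substitutes a nearest-centroid scorer for the SVM so that the example needs no external libraries. The names split, train_centroid, train_m3, and predict_m3 are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def split(samples, n_parts):
    """Partition samples into n_parts roughly equal subsets (round-robin)."""
    return [samples[k::n_parts] for k in range(n_parts)]


def train_centroid(pos, neg):
    """Stand-in base learner (NOT an SVM): nearest-centroid scorer.

    The returned function gives a positive score when x is closer to the
    positive-subset centroid than to the negative-subset centroid.
    """
    def centroid(points):
        return [sum(col) / len(points) for col in zip(*points)]

    def sqdist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    c_pos, c_neg = centroid(pos), centroid(neg)
    return lambda x: sqdist(x, c_neg) - sqdist(x, c_pos)


def train_m3(pos, neg, n_pos, n_neg, train=train_centroid):
    """Step 1 and 2: decompose, then train one module per subset pair."""
    pos_parts, neg_parts = split(pos, n_pos), split(neg, n_neg)
    pairs = list(product(range(n_pos), range(n_neg)))
    # The subproblems are independent, so they can be trained concurrently.
    with ThreadPoolExecutor() as pool:
        models = list(
            pool.map(lambda ij: train(pos_parts[ij[0]], neg_parts[ij[1]]), pairs)
        )
    # grid[i][j] is the module trained on positive subset i vs negative subset j.
    return [[models[i * n_neg + j] for j in range(n_neg)] for i in range(n_pos)]


def predict_m3(grid, x):
    """Step 3: MIN over modules sharing a positive subset, then MAX over rows."""
    return max(min(module(x) for module in row) for row in grid)


pos = [(2, 2), (2, 3), (3, 2), (3, 3)]
neg = [(-2, -2), (-2, -3), (-3, -2), (-3, -3)]
grid = train_m3(pos, neg, n_pos=2, n_neg=2)
label = 1 if predict_m3(grid, (2.5, 2.5)) > 0 else -1
```

In this sketch, adding a new positive or negative subset only requires training the modules that pair it with the existing subsets; the modules already in the grid are left untouched. Replacing train_centroid with a real SVM trainer that returns a decision function would leave train_m3 and predict_m3 unchanged.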