• DocumentCode
    2706352
  • Title
    A fast SVM training method for very large datasets
  • Author
    Li, Boyang ; Wang, Qiangwei ; Hu, Jinglu
  • Author_Institution
    Grad. Sch. of Inf., Production & Syst., Waseda Univ., Kitakyushu, Japan
  • fYear
    2009
  • fDate
    14-19 June 2009
  • Firstpage
    1784
  • Lastpage
    1789
  • Abstract
    In a standard support vector machine (SVM), training has O(n^3) time and O(n^2) space complexity, where n is the size of the training dataset, which makes it computationally infeasible for very large datasets. Reducing the size of the training dataset is a natural way to solve this problem. SVM classifiers depend only on the support vectors (SVs), which lie close to the separation boundary, so the samples that are likely to be SVs need to be retained. In this paper, we propose a method based on the edge detection technique to detect these samples. To preserve the overall distribution properties, we also use a clustering algorithm such as K-means to calculate the centroids of clusters. The samples selected by the edge detector and the centroids of the clusters are used to reconstruct the training dataset. The reconstructed, smaller training dataset makes the training process much faster without degrading classification accuracy.
  • Keywords
    computational complexity; edge detection; pattern clustering; support vector machines; very large databases; K-means; SVM classifiers; classification accuracies; clustering algorithm; edge detection technique; fast SVM training method; space complexities; support vector machine; time complexities; training dataset; training process; very large datasets; Clustering algorithms; Detectors; Image edge detection; Kernel; Matrix decomposition; Quadratic programming; Sampling methods; Support vector machine classification; Support vector machines; Training data;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Neural Networks, 2009. IJCNN 2009. International Joint Conference on
  • Conference_Location
    Atlanta, GA
  • ISSN
    1098-7576
  • Print_ISBN
    978-1-4244-3548-7
  • Electronic_ISBN
    1098-7576
  • Type
    conf
  • DOI
    10.1109/IJCNN.2009.5178618
  • Filename
    5178618
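
The reduction idea described in the abstract (keep samples near the class boundary, plus cluster centroids that summarise the rest of the distribution) can be sketched as below. This is a minimal illustration only, not the authors' implementation: the abstract does not specify the edge detector, so the sketch assumes a simple stand-in that flags a sample when any of its k nearest neighbours carries a different label. The names `is_near_boundary`, `kmeans`, and `reduce_training_set`, and the parameters `k` and `n_clusters`, are hypothetical. K-means is run per class here, which is also an assumption.

```python
import math
import random


def is_near_boundary(i, X, y, k):
    """Assumed stand-in for the edge detector: True if any of sample i's
    k nearest neighbours has a different label."""
    dists = sorted(
        (math.dist(X[i], X[j]), j) for j in range(len(X)) if j != i
    )
    return any(y[j] != y[i] for _, j in dists[:k])


def kmeans(points, n_clusters, n_iter=20, seed=0):
    """Plain Lloyd's K-means on a list of tuples; returns the centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, n_clusters)
    for _ in range(n_iter):
        groups = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda c: math.dist(p, centroids[c]),
            )
            groups[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centroids[c]
            for c, g in enumerate(groups)
        ]
    return centroids


def reduce_training_set(X, y, k=3, n_clusters=2):
    """Return (X_small, y_small): boundary samples plus per-class centroids."""
    X_small, y_small = [], []
    # Step 1: keep samples that look like support-vector candidates.
    for i in range(len(X)):
        if is_near_boundary(i, X, y, k):
            X_small.append(tuple(X[i]))
            y_small.append(y[i])
    # Step 2: add centroids so the overall class distribution is represented.
    for label in sorted(set(y)):
        members = [tuple(x) for x, lab in zip(X, y) if lab == label]
        for c in kmeans(members, min(n_clusters, len(members))):
            X_small.append(c)
            y_small.append(label)
    return X_small, y_small


# Toy usage: two classes laid out along a line, split between x=9 and x=10.
X = [(float(i), 0.0) for i in range(20)]
y = [0] * 10 + [1] * 10
X_small, y_small = reduce_training_set(X, y, k=2, n_clusters=1)
```

The reduced pair `(X_small, y_small)` would then be passed to any SVM trainer in place of the full dataset. The brute-force neighbour search above is quadratic in n and is only meant to make the selection step concrete; a practical version would need a faster neighbour search for the reduction to pay off on very large datasets.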