• DocumentCode
    2708247
  • Title
    GNG-SVM framework - classifying large datasets with Support Vector Machines using Growing Neural Gas
  • Author
    Linda, Ondrej ; Manic, Milos
  • fYear
    2009
  • fDate
    14-19 June 2009
  • Firstpage
    1820
  • Lastpage
    1826
  • Abstract
    Support vector machines (SVMs) are a well-known technique for data classification. However, the complexity of the training process makes SVMs unsuitable for classifying large datasets. Existing approaches to this problem include sampling the input dataset or clustering similar inputs. The growing neural gas (GNG) algorithm, in turn, is a robust tool for cluster analysis that is capable of learning the topology of the data. It overcomes most of the common issues of clustering techniques, such as the need to specify the number of clusters or the cluster radius in advance. This paper presents a solution to the problem of classifying large datasets by learning the data topology. The described algorithm combines the GNG algorithm with an SVM solver into a dedicated algorithm for classifying large datasets: the GNG-SVM framework. The input dataset is first preprocessed with the GNG algorithm. A new, reduced training dataset is then created from the extracted topological knowledge. Because the size of the dataset is significantly reduced, the training process of the SVM solver becomes substantially less memory-demanding. The performance of the proposed GNG-SVM framework is tested on both synthetic and real-world benchmark datasets.
  • Keywords
    knowledge acquisition; learning (artificial intelligence); pattern classification; pattern clustering; support vector machines; GNG-SVM framework; data topology; growing neural gas; knowledge extraction; large dataset classification; machine learning; support vector machine; Algorithm design and analysis; Classification algorithms; Clustering algorithms; Data mining; Data preprocessing; Robustness; Sampling methods; Support vector machine classification; Topology; Data Classification; Large Datasets
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Neural Networks, 2009. IJCNN 2009. International Joint Conference on
  • Conference_Location
    Atlanta, GA
  • ISSN
    1098-7576
  • Print_ISBN
    978-1-4244-3548-7
  • Electronic_ISBN
    1098-7576
  • Type
    conf
  • DOI
    10.1109/IJCNN.2009.5178713
  • Filename
    5178713
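
The preprocessing step described in the abstract (GNG first, then a reduced training set for the SVM) can be illustrated with a rough sketch. The record does not say how the reduced dataset is built from the learned topology, so the construction below is an assumption: a minimal growing neural gas (following Fritzke's standard formulation) is run separately on each class, and the resulting node positions are used as labelled training points. The names `gng` and `reduce_dataset` and all parameter values are hypothetical, and the SVM training step itself is left out.

```python
import random

def gng(points, max_nodes=20, steps=2000, eps_b=0.05, eps_n=0.006,
        max_age=50, insert_every=100, alpha=0.5, decay=0.995, seed=0):
    """Minimal Growing Neural Gas sketch; returns the learned node positions."""
    rng = random.Random(seed)
    pos = {0: list(rng.choice(points)), 1: list(rng.choice(points))}
    err = {0: 0.0, 1: 0.0}
    edges = {}  # frozenset({a, b}) -> age
    next_id = 2

    def d2(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    for step in range(1, steps + 1):
        x = rng.choice(points)
        # Winner s1 and runner-up s2 for this input.
        s1, s2 = sorted(pos, key=lambda n: d2(pos[n], x))[:2]
        err[s1] += d2(pos[s1], x)
        pos[s1] = [p + eps_b * (xi - p) for p, xi in zip(pos[s1], x)]
        # Age the winner's edges and pull its neighbours toward x.
        for e in list(edges):
            if s1 in e:
                edges[e] += 1
                (nb,) = e - {s1}
                pos[nb] = [p + eps_n * (xi - p) for p, xi in zip(pos[nb], x)]
        edges[frozenset((s1, s2))] = 0  # create or refresh the s1-s2 edge
        # Remove edges that are too old, then nodes left without edges.
        for e in [e for e, age in edges.items() if age > max_age]:
            del edges[e]
        connected = set().union(*edges)
        for n in [n for n in pos if n not in connected]:
            del pos[n]
            del err[n]
        # Periodically insert a node between the worst node and its worst neighbour.
        if step % insert_every == 0 and len(pos) < max_nodes:
            q = max(err, key=err.get)
            nbrs = [next(iter(e - {q})) for e in edges if q in e]
            f = max(nbrs, key=err.get)
            r, next_id = next_id, next_id + 1
            pos[r] = [(a + b) / 2 for a, b in zip(pos[q], pos[f])]
            del edges[frozenset((q, f))]
            edges[frozenset((q, r))] = 0
            edges[frozenset((r, f))] = 0
            err[q] *= alpha
            err[f] *= alpha
            err[r] = err[q]
        for n in err:
            err[n] *= decay
    return list(pos.values())

def reduce_dataset(X, y, **gng_kwargs):
    """Assumed reduction: run GNG per class and label each node with that class."""
    reduced_X, reduced_y = [], []
    for label in sorted(set(y)):
        pts = [x for x, t in zip(X, y) if t == label]
        for node in gng(pts, **gng_kwargs):
            reduced_X.append(node)
            reduced_y.append(label)
    return reduced_X, reduced_y

# Synthetic two-class data: two Gaussian blobs of 500 points each.
data_rng = random.Random(1)
X = ([[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(500)]
     + [[data_rng.gauss(4, 1), data_rng.gauss(4, 1)] for _ in range(500)])
y = [0] * 500 + [1] * 500
Xr, yr = reduce_dataset(X, y)
print(len(X), "->", len(Xr))
```

The reduced pair `Xr`, `yr` could then be passed to any SVM solver in place of the full dataset; whether this matches the paper's actual construction cannot be confirmed from the abstract alone.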