• DocumentCode
    586361
  • Title
    Region based Support Vector Machine algorithm for medical diagnosis on Pima Indian Diabetes dataset
  • Author
    Karatsiolis, S. ; Schizas, Christos N.
  • Author_Institution
    Comput. Sci. Dept., Univ. of Cyprus, Nicosia, Cyprus
  • fYear
    2012
  • fDate
    11-13 Nov. 2012
  • Firstpage
    139
  • Lastpage
    144
  • Abstract
    The problem of diagnosing diabetes in the Pima Indian Diabetes data, obtained from the UCI Repository of Machine Learning Databases [6], is handled with a modified Support Vector Machine strategy. A performance comparison with previous studies is presented to demonstrate the proposed algorithm's advantages over various classification methods. The goal of the paper is to convey a methodology that can be used efficiently to raise the classification success rates obtained with conventional approaches such as Neural Networks, RBF networks and K-nearest neighbors. The suggested algorithm divides the training set into two subsets: one formed by joining coherent data regions, and one comprising the portion of the data that is difficult to cluster. The first subset is then used to train a Support Vector Machine with an RBF kernel, and the second subset is used to train another Support Vector Machine with a polynomial kernel. During classification, the algorithm identifies which of the two Support Vector Machine models to use. The intuition behind the suggested algorithm rests on the expectation that the RBF Support Vector Machine model is appropriate for data with different characteristics than those suited to the polynomial-kernel model. In this case study, the suggested algorithm raised the average classification success rate to 82.2%, while the best performance reported by previous studies was 81%, obtained with a fine-tuned, highly complex ARTMAP-IC model.
  • Keywords
    matrix algebra; medical diagnostic computing; pattern classification; polynomials; radial basis function networks; support vector machines; K-nearest neighbors; Pima Indian diabetes dataset; RBF kernel; RBF networks; RBF support vector machine model; UCI repository; classification methods; classification success rates; coherent data regions; data portion; machine learning databases; medical diagnosis; neural networks; polynomial kernel; region based support vector machine algorithm; Classification algorithms; Clustering algorithms; Data models; Kernel; Polynomials; Support vector machines; Training; Clustering; Pima Indian Diabetes; Support Vector Machine; Support Vector Machine Kernel;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Bioinformatics & Bioengineering (BIBE), 2012 IEEE 12th International Conference on
  • Conference_Location
    Larnaca
  • Print_ISBN
    978-1-4673-4357-2
  • Type
    conf
  • DOI
    10.1109/BIBE.2012.6399663
  • Filename
    6399663
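
The two-model scheme summarized in the abstract can be illustrated with a minimal sketch. The abstract does not say how coherent regions are found or how a model is chosen at classification time, so everything below is an assumption made for illustration: a training point is treated as "coherent" when its k nearest training neighbors share its label, a query is routed to the model that owns its nearest training point, and a simple kernel perceptron stands in for a real Support Vector Machine solver. The names `split_by_coherence`, `KernelPerceptron` and `TwoRegionClassifier`, and the parameters `gamma`, `degree`, `coef0` and `k`, are hypothetical and not taken from the paper. Labels are assumed to be -1 or +1.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def rbf_kernel(x, z, gamma=0.5):
    return math.exp(-gamma * sq_dist(x, z))


def poly_kernel(x, z, degree=2, coef0=1.0):
    return (sum(a * b for a, b in zip(x, z)) + coef0) ** degree


class KernelPerceptron:
    """Stand-in for an SVM: a kernel perceptron, NOT a max-margin solver."""

    def __init__(self, kernel, epochs=50):
        self.kernel = kernel
        self.epochs = epochs
        self.support = []  # (point, label) pairs stored on each mistake

    def decision(self, x):
        return sum(y * self.kernel(sx, x) for sx, y in self.support)

    def fit(self, X, y):
        for _ in range(self.epochs):
            mistakes = 0
            for xi, yi in zip(X, y):
                if yi * self.decision(xi) <= 0:
                    self.support.append((xi, yi))
                    mistakes += 1
            if mistakes == 0:  # every training point classified correctly
                break
        return self

    def predict(self, x):
        return 1 if self.decision(x) >= 0 else -1


def split_by_coherence(X, y, k=3):
    """Assumed rule: a point is coherent if its k nearest neighbors agree with it."""
    coherent, hard = [], []
    for i, xi in enumerate(X):
        others = sorted((j for j in range(len(X)) if j != i),
                        key=lambda j: sq_dist(xi, X[j]))
        agrees = all(y[j] == y[i] for j in others[:k])
        (coherent if agrees else hard).append(i)
    return coherent, hard


class TwoRegionClassifier:
    """RBF model for the coherent subset, polynomial model for the rest."""

    def fit(self, X, y, k=3):
        coherent, hard = split_by_coherence(X, y, k)
        kernels = {"rbf": rbf_kernel, "poly": poly_kernel}
        self.tagged, self.models = [], {}
        for tag, idx in (("rbf", coherent), ("poly", hard)):
            if not idx:  # skip a subset that turned out empty
                continue
            self.tagged += [(X[i], tag) for i in idx]
            self.models[tag] = KernelPerceptron(kernels[tag]).fit(
                [X[i] for i in idx], [y[i] for i in idx])
        return self

    def route(self, x):
        # Assumed routing rule: use the model that owns the nearest training point.
        return min(self.tagged, key=lambda p: sq_dist(p[0], x))[1]

    def predict(self, x):
        return self.models[self.route(x)].predict(x)
```

In practice the kernel perceptron would be replaced by a proper SVM implementation, and the coherence and routing rules by whatever region-identification procedure the paper actually specifies; the sketch only shows how two kernel models and a routing step fit together.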