• DocumentCode
    3165616
  • Title

    Improving classification with Support Vector Machine for unbalanced data

  • Author

    Muntean, M. ; Vălean, H. ; Ileană, I. ; Rotar, C.

  • Author_Institution
    1 Decembrie 1918 Univ. of Alba Iulia, Alba Iulia, Romania
  • Volume
    3
  • fYear
    2010
  • fDate
    28-30 May 2010
  • Firstpage
    1
  • Lastpage
    6
  • Abstract
    Classifying unbalanced datasets with Support Vector Machines poses a problem in data mining. Because of the uneven class distribution and the soft margin of the classifier, the algorithm tries to improve the overall accuracy on the dataset, and in the process it may misclassify many instances of weakly represented classes, mistaking them for overshoot values in the dataset and thus ignoring them. This paper introduces the Enhancer, a new algorithm that improves cost-sensitive classification for Support Vector Machines by multiplying the instances of the underrepresented classes in the training step. We have found that oversampling the instances of the class of interest helps the Support Vector Machine algorithm overcome the soft margin. As a result, it classifies future instances of this class of interest better.
  • Keywords
    data mining; pattern classification; support vector machines; cost-sensitive classification; enhancer algorithm; soft margin; support vector machine; unbalanced dataset; Costs; Learning systems; Machine learning; Pattern recognition; Probability; Sampling methods; Statistical learning; Support vector machine classification
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Automation Quality and Testing Robotics (AQTR), 2010 IEEE International Conference on
  • Conference_Location
    Cluj-Napoca
  • Print_ISBN
    978-1-4244-6724-2
  • Type

    conf

  • DOI
    10.1109/AQTR.2010.5520736
  • Filename
    5520736
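
The abstract describes multiplying the instances of an underrepresented class before training. The following is a minimal sketch of that general idea (plain replication of minority instances), not the paper's Enhancer algorithm, whose details are not given in this record. The function name `oversample_class`, the `factor` parameter, and the toy data are all assumptions made for illustration.

```python
from collections import Counter


def oversample_class(X, y, target_label, factor):
    """Return copies of X and y in which each instance of target_label
    appears `factor` times in total (factor=1 leaves the data unchanged).

    Hypothetical helper for illustration; not taken from the paper.
    """
    if factor < 1:
        raise ValueError("factor must be >= 1")
    X_out, y_out = [], []
    for features, label in zip(X, y):
        # Replicate only the class of interest; keep other classes as-is.
        copies = factor if label == target_label else 1
        for _ in range(copies):
            X_out.append(list(features))
            y_out.append(label)
    return X_out, y_out


# Toy unbalanced data (made up): 6 instances of class 0, 2 of class 1.
X = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.2, 0.4], [0.4, 0.2], [0.1, 0.4],
     [0.9, 0.8], [0.8, 0.9]]
y = [0, 0, 0, 0, 0, 0, 1, 1]

X_bal, y_bal = oversample_class(X, y, target_label=1, factor=3)
print(Counter(y_bal))  # class counts after replication
```

The replicated set would then be passed to whatever SVM trainer is in use. Replication by an integer factor is only one possible choice; how the Enhancer selects the multiplication factor is not stated in the abstract.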