• DocumentCode
    624900
  • Title

    A binarization strategy for modelling mixed data in multigroup classification

  • Author

    Masmoudi, Youssef ; Turkay, Metin ; Chabchoub, Habib

  • Author_Institution
    High Sch. of Commerce, Univ. of Sfax, Sfax, Tunisia
  • fYear
    2013
  • fDate
    29-31 May 2013
  • Firstpage
    347
  • Lastpage
    353
  • Abstract
    This paper presents a binarization pre-processing strategy for mixed datasets. We propose that representing nominal and integer data with binary attributes is beneficial for classification accuracy, and we describe a procedure for converting integer and nominal data into binary attributes. An Expectation-Maximization (EM) clustering algorithm was applied to group the values of attributes with a wide range, so that only a small number of binary attributes is needed. Once the dataset is pre-processed, we use a Support Vector Machine (LibSVM) for classification. The proposed method was tested on datasets from the literature. We demonstrate the improved accuracy and efficiency of the presented binarization strategy for modelling mixed and complex data in comparison with classification of the original dataset, the nominal dataset and the binary dataset.
  • Keywords
    expectation-maximisation algorithm; pattern classification; pattern clustering; support vector machines; EM clustering algorithm; LibSVM; binarization preprocessing strategy; binary attributes; binary dataset; classification accuracy; expectation-maximization clustering algorithm; integer data representation; mixed dataset modelling; multigroup classification; nominal data representation; nominal dataset; support vector machine; Accuracy; Clustering algorithms; Data models; Lenses; Servomotors; Support vector machines; Vehicles; Classification; Clustering of Attribute Values; Expectation-Maximization Algorithm (EM); Feature Binarization; Pre-processing Data;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Advanced Logistics and Transport (ICALT), 2013 International Conference on
  • Conference_Location
    Sousse
  • Print_ISBN
    978-1-4799-0314-6
  • Type

    conf

  • DOI
    10.1109/ICAdLT.2013.6568483
  • Filename
    6568483
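
The pre-processing idea summarised in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `one_hot` and `em_1d`, the one-bit-per-category encoding, the two-component Gaussian mixture, and the sample values are all assumptions chosen for illustration, and the final LibSVM classification step is omitted. The sketch turns a nominal attribute into binary attributes, then groups a wide-range integer attribute with a small 1-D EM fit so that it needs only one binary attribute per cluster.

```python
import math


def one_hot(values):
    """Map each value to a 0/1 vector with one position per distinct category."""
    categories = sorted(set(values))
    vectors = [[1 if v == c else 0 for c in categories] for v in values]
    return vectors, categories


def em_1d(xs, k=2, iters=50):
    """Fit a k-component 1-D Gaussian mixture with EM; return a hard label per value."""
    lo, hi = min(xs), max(xs)
    # Deterministic initialisation (an assumption): means spread evenly over the range.
    means = [lo + (hi - lo) * (j + 0.5) / k for j in range(k)]
    variances = [((hi - lo) / k) ** 2 or 1.0] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each value.
        resp = []
        for x in xs:
            dens = [
                w * math.exp(-((x - m) ** 2) / (2 * v)) / math.sqrt(2 * math.pi * v)
                for w, m, v in zip(weights, means, variances)
            ]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means and variances from the responsibilities.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # leave an empty component unchanged
            weights[j] = nj / len(xs)
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var, 1e-6)  # floor keeps the density finite
    return [max(range(k), key=lambda j: r[j]) for r in resp]


# Hypothetical sample data: one nominal attribute and one wide-range integer attribute.
colours = ["red", "blue", "red"]
colour_bits, colour_names = one_hot(colours)

amounts = [1, 2, 3, 100, 101, 102]
labels = em_1d(amounts, k=2)          # cluster index per value
amount_bits, _ = one_hot(labels)      # one binary attribute per cluster
```

Clustering first means the integer attribute contributes `k` binary attributes rather than one per distinct value, which is the motivation the abstract gives for applying EM to wide-range attributes.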