  • DocumentCode
    1849598
  • Title
    Experimental analysis of new algorithms for learning ternary classifiers
  • Author
    Zucker, Jean-Daniel ; Chevaleyre, Yann ; Van Sang, Dao
  • Author_Institution
    IRD France Nord, UMMISCO, Bondy, France
  • fYear
    2015
  • fDate
    25-28 Jan. 2015
  • Firstpage
    19
  • Lastpage
    24
  • Abstract
    Discrete linear classifiers are a very sparse class of decision models that have proved useful for reducing overfitting in very high-dimensional learning problems. However, learning a discrete linear classifier is known to be a difficult problem: it requires finding a discrete linear model that minimizes the classification error over a given sample. A ternary classifier is defined by a pair (w, r), where w is a weight vector in {-1, 0, +1}^n and r is a nonnegative real capturing the threshold (offset). The goal of the learning algorithm is to find a weight vector in {-1, 0, +1}^n that minimizes the hinge loss of the linear model on the training data. This problem is NP-hard, and one approach consists in exactly solving the relaxed continuous problem and then heuristically deriving discrete solutions from it. A recent paper by the authors introduced a randomized rounding algorithm [1]; in this paper we propose more sophisticated algorithms that improve the generalization error. These algorithms are presented and their performance is analyzed experimentally. Our results show that this kind of compact model can address the complex problem of learning predictors from bioinformatics data, such as metagenomics data, where the number of samples is much smaller than the number of attributes. The new algorithms improve on the state-of-the-art algorithm for learning ternary classifiers; this improvement comes at the expense of time complexity.
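    The abstract only sketches the relax-then-round approach it builds on; the following is a minimal Python sketch of that general idea, not the authors' algorithm from [1] or the new algorithms of the paper. The helper names (randomized_round, hinge_loss), the least-squares continuous surrogate, and the best-of-k selection are illustrative assumptions.

    ```python
    import numpy as np

    def randomized_round(w_cont, rng):
        """Round a continuous weight vector into {-1, 0, +1}^n by randomized rounding.

        Each coordinate w_i (clipped to [-1, 1]) is rounded to sign(w_i) with
        probability |w_i| and to 0 otherwise, so the rounded vector equals the
        continuous one in expectation.
        """
        w = np.clip(w_cont, -1.0, 1.0)
        keep = rng.random(w.shape) < np.abs(w)  # keep coordinate i with prob |w_i|
        return np.where(keep, np.sign(w), 0.0)

    def hinge_loss(w, r, X, y):
        """Average hinge loss of the ternary classifier sign(<w, x> - r) on (X, y)."""
        margins = y * (X @ w - r)
        return np.maximum(0.0, 1.0 - margins).mean()

    # Toy usage: solve a crude continuous relaxation, then draw several random
    # roundings and keep the one with the lowest hinge loss on the sample.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(50, 20))
    y = np.sign(X[:, 0] - X[:, 1] + 0.1 * rng.normal(size=50))
    w_cont = np.linalg.lstsq(X, y, rcond=None)[0]   # illustrative continuous surrogate
    w_cont /= np.max(np.abs(w_cont))                # scale into [-1, 1]
    candidates = [randomized_round(w_cont, rng) for _ in range(100)]
    best = min(candidates, key=lambda w: hinge_loss(w, 0.0, X, y))
    print("best sample hinge loss:", hinge_loss(best, 0.0, X, y))
    ```

    Drawing many roundings and keeping the best one is a common way to make randomized rounding practical; the paper's contribution, per the abstract, is a more sophisticated derivation of discrete solutions that trades extra time complexity for lower generalization error.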
  • Keywords
    bioinformatics; computational complexity; generalisation (artificial intelligence); learning (artificial intelligence); pattern classification; vectors; NP-hard; bioinformatics data; classification error minimization; decision model; discrete linear classifier learning; generalization error; metagenomics; randomized rounding algorithm; ternary classifier learning; time complexity; vector; Algorithm design and analysis; Classification algorithms; Data models; Error analysis; Fasteners; Prediction algorithms; Vectors; Metagenomics data; Randomized Rounding; Ternary Classifier;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computing & Communication Technologies - Research, Innovation, and Vision for the Future (RIVF), 2015 IEEE RIVF International Conference on
  • Conference_Location
    Can Tho
  • Print_ISBN
    978-1-4799-8043-7
  • Type
    conf
  • DOI
    10.1109/RIVF.2015.7049868
  • Filename
    7049868