DocumentCode
1849598
Title
Experimental analysis of new algorithms for learning ternary classifiers
Author
Zucker, Jean-Daniel ; Chevaleyre, Yann ; Van Sang, Dao
Author_Institution
IRD France Nord, UMMISCO, Bondy, France
fYear
2015
fDate
25-28 Jan. 2015
Firstpage
19
Lastpage
24
Abstract
Discrete linear classifiers form a very sparse class of decision models that has proved useful for reducing overfitting in very high-dimensional learning problems. However, learning a discrete linear classifier is known to be a difficult problem: it requires finding a discrete linear model that minimizes the classification error over a given sample. A ternary classifier is defined by a pair (w, r), where w is a vector in {-1, 0, +1}^n and r is a nonnegative real number capturing the threshold or offset. The goal of the learning algorithm is to find a weight vector in {-1, 0, +1}^n that minimizes the hinge loss of the linear model on the training data. This problem is NP-hard, and one approach consists in solving the relaxed continuous problem exactly and then heuristically deriving discrete solutions. A recent paper by the authors introduced a randomized rounding algorithm [1]; in this paper we propose more sophisticated algorithms that improve the generalization error. These algorithms are presented and their performance is analyzed experimentally. Our results show that this kind of compact model can address the complex problem of learning predictors from bioinformatics data, such as metagenomics data, where the sample size is much smaller than the number of attributes. The new algorithms improve on the state-of-the-art algorithm for learning ternary classifiers. This improvement comes at the expense of time complexity.
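The abstract's relax-then-round idea can be illustrated with a minimal sketch. It is not the authors' algorithm from [1] or from this paper; it is a generic randomized rounding under stated assumptions. The decision rule (predict +1 when w.x >= r), the relaxed weights lying in [-1, 1], the rounding probabilities (sign(w_i) with probability |w_i|, otherwise 0), and all function names are hypothetical choices made for illustration only.

```python
import random

def predict(w, r, x):
    """Return +1 if the ternary score w.x reaches the threshold r, else -1 (assumed rule)."""
    score = sum(wi * xi for wi, xi in zip(w, x))
    return 1 if score >= r else -1

def hinge_loss(w, r, X, y):
    """Mean hinge loss max(0, 1 - y * (w.x - r)) of the model (w, r) over the sample."""
    total = 0.0
    for xi, yi in zip(X, y):
        margin = yi * (sum(a * b for a, b in zip(w, xi)) - r)
        total += max(0.0, 1.0 - margin)
    return total / len(X)

def randomized_round(w_relaxed, rng):
    """Round each relaxed weight to {-1, 0, +1}: sign(w) with probability |w|, else 0."""
    rounded = []
    for wi in w_relaxed:
        wi = max(-1.0, min(1.0, wi))  # clip to [-1, 1] (assumed relaxation domain)
        if rng.random() < abs(wi):
            rounded.append(1 if wi > 0 else -1)
        else:
            rounded.append(0)
    return rounded

def best_of_rounds(w_relaxed, r, X, y, n_rounds=50, seed=0):
    """Draw several roundings and keep the one with the lowest training hinge loss."""
    rng = random.Random(seed)
    best_w, best_loss = None, float("inf")
    for _ in range(n_rounds):
        w = randomized_round(w_relaxed, rng)
        loss = hinge_loss(w, r, X, y)
        if loss < best_loss:
            best_w, best_loss = w, loss
    return best_w, best_loss
```

In this sketch `w_relaxed` would come from a continuous solver for the relaxed problem, which is not shown. Repeating the rounding and keeping the best draw trades extra computation for a lower training loss, which echoes the abstract's remark that the improvement is paid for in time complexity.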
Keywords
bioinformatics; computational complexity; generalisation (artificial intelligence); learning (artificial intelligence); pattern classification; vectors; NP-hard; bioinformatics data; classification error minimization; decision model; discrete linear classifier learning; generalization error; metagenomics; randomized rounding algorithm; ternary classifier learning; time complexity; vector; Algorithm design and analysis; Classification algorithms; Data models; Error analysis; Fasteners; Prediction algorithms; Vectors; Metagenomics data; Randomized Rounding; Ternary Classifier;
fLanguage
English
Publisher
IEEE
Conference_Titel
Computing & Communication Technologies - Research, Innovation, and Vision for the Future (RIVF), 2015 IEEE RIVF International Conference on
Conference_Location
Can Tho
Print_ISBN
978-1-4799-8043-7
Type
conf
DOI
10.1109/RIVF.2015.7049868
Filename
7049868