DocumentCode :
109023
Title :
Minimax Sparse Logistic Regression for Very High-Dimensional Feature Selection
Author :
Tan, Mingkui ; Tsang, Ivor W. ; Wang, Li
Author_Institution :
Sch. of Comput. Eng., Nanyang Technol. Univ., Singapore, Singapore
Volume :
24
Issue :
10
fYear :
2013
fDate :
Oct. 2013
Firstpage :
1609
Lastpage :
1622
Abstract :
Because of its strong convexity and probabilistic underpinnings, logistic regression (LR) is widely used in many real-world applications. However, in many problems, such as those in bioinformatics, choosing a small subset of features with the most discriminative power is desirable for interpreting the prediction model, making robust predictions, or performing deeper analysis. To achieve a sparse solution with respect to input features, many sparse LR models have been proposed. However, it is still challenging for them to efficiently obtain unbiased sparse solutions to very high-dimensional problems (e.g., identifying the most discriminative subset from millions of features). In this paper, we propose a new minimax sparse LR model for very high-dimensional feature selection, which can be efficiently solved by a cutting plane algorithm. To solve the resultant nonsmooth minimax subproblems, a smoothing coordinate descent method is presented. Numerical issues and the convergence rate of this method are carefully studied. Experimental results on several synthetic and real-world datasets show that the proposed method can obtain better prediction accuracy with the same number of selected features and has better or competitive scalability on very high-dimensional problems compared with the baseline methods, including the l1-regularized LR.
Keywords :
bioinformatics; minimax techniques; probability; regression analysis; cutting plane algorithm; l1-regularized LR; minimax sparse LR model; minimax sparse logistic regression; probabilistic underpinnings; real-world datasets; resultant nonsmooth minimax subproblems; smoothing coordinate descent method; strong convexity; synthetic datasets; very high-dimensional feature selection; feature selection; minimax problem; single-nucleotide polymorphism (SNP) detection; smoothing method; sparse logistic regression
fLanguage :
English
Journal_Title :
Neural Networks and Learning Systems, IEEE Transactions on
Publisher :
IEEE
ISSN :
2162-237X
Type :
jour
DOI :
10.1109/TNNLS.2013.2263427
Filename :
6542036