DocumentCode :
1352759
Title :
The LASSO and Sparse Least Squares Regression Methods for SNP Selection in Predicting Quantitative Traits
Author :
Feng, Z.Z. ; Xiaojian Yang ; Subedi, S. ; McNicholas, P.D.
Author_Institution :
Dept. of Math. & Stat., Univ. of Guelph, Guelph, ON, Canada
Volume :
9
Issue :
2
fYear :
2012
Firstpage :
629
Lastpage :
636
Abstract :
Recent work concerning quantitative traits of interest has focused on selecting a small subset of single nucleotide polymorphisms (SNPs) from among the SNPs responsible for the phenotypic variation of the trait. When SNPs are treated as covariates, their large number and their association with neighbouring SNPs pose challenges for variable selection. The sparsity and coefficient shrinkage provided by the least absolute shrinkage and selection operator (LASSO) method appear attractive for SNP selection. Sparse partial least squares (SPLS) is also appealing, as it combines sparsity in subset selection with dimension reduction to handle correlations among SNPs. In this paper, we investigate the application of the LASSO and SPLS methods for selecting SNPs that predict quantitative traits. We evaluate the performance of both methods with different criteria and under different scenarios using simulation studies. Results indicate that these methods can be effective in selecting SNPs that predict quantitative traits but are limited by some conditions. Both methods perform similarly overall, but each exhibits advantages over the other in given situations. Both methods are applied to Canadian Holstein cattle data to compare their performance.
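To make the LASSO idea in the abstract concrete, here is a minimal sketch of sparse SNP selection by cyclic coordinate descent. It is not the authors' implementation or simulation design: the genotype coding (0/1/2 minor-allele counts), the minor allele frequency, the effect sizes, the penalty value `lam`, and every function name are assumptions chosen for illustration only.

```python
import random


def soft_threshold(z, gamma):
    """Soft-thresholding operator used in the LASSO coordinate update."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def center_columns(X):
    """Subtract each column mean so that no intercept term is needed."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    return [[row[j] - means[j] for j in range(p)] for row in X]


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimise (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by cyclic updates."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X b while b is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # monomorphic SNP: carries no information
            # Correlation of SNP j with the partial residual (SNP j removed).
            rho = sum(X[i][j] * residual[i] for i in range(n)) / n
            rho += col_sq[j] * beta[j]
            new_beta = soft_threshold(rho, lam) / col_sq[j]
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_beta
    return beta


# Hypothetical simulation: 200 animals, 30 independent SNPs, 2 causal SNPs.
random.seed(0)
n_samples, n_snps, maf = 200, 30, 0.3
genotypes = [
    [(random.random() < maf) + (random.random() < maf) for _ in range(n_snps)]
    for _ in range(n_samples)
]
true_effects = {0: 2.0, 5: -1.5}  # assumed effect sizes, not from the paper
trait = [
    sum(effect * row[j] for j, effect in true_effects.items())
    + random.gauss(0.0, 0.5)
    for row in genotypes
]

X = center_columns(genotypes)
trait_mean = sum(trait) / n_samples
y = [t - trait_mean for t in trait]

beta = lasso_coordinate_descent(X, y, lam=0.2)
selected = [j for j, b in enumerate(beta) if b != 0.0]
print("selected SNP indices:", selected)
```

The SNPs here are simulated independently, so this sketch does not reproduce the correlation among nearby SNPs that the abstract identifies as a difficulty; in practice the penalty would also be tuned, for example by cross-validation, rather than fixed in advance.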
Keywords :
bioinformatics; genetics; least squares approximations; regression analysis; Canadian Holstein cattle; LASSO; SNP selection; SPLS; least absolute shrinkage and selection operator; phenotypic variation; quantitative traits; regression coefficients; single nucleotide polymorphisms; sparse least squares regression methods; sparse partial least squares; sparsity; Accuracy; Biological cells; Correlation; Input variables; Predictive models; Training; statistical computing; Algorithms; Animals; Cattle; Computational Biology; Computer Simulation; Least-Squares Analysis; Models, Genetic; Polymorphism, Single Nucleotide; Quantitative Trait Loci
fLanguage :
English
Journal_Title :
Computational Biology and Bioinformatics, IEEE/ACM Transactions on
Publisher :
IEEE
ISSN :
1545-5963
Type :
jour
DOI :
10.1109/TCBB.2011.139
Filename :
6051425