Title :
Multivariate gene selection: does it help?
Author :
Lai, Carmen ; Reinders, Marcel ; Wessels, Lodewyk
Author_Institution :
Information & Communication Theory Group, Delft University of Technology, Netherlands
Abstract :
When building predictors of disease state from gene expression data, gene selection is performed both to achieve good performance and to identify a relevant subset of genes. Although several gene selection algorithms have been proposed, a fair comparison of the available results is very problematic, mainly for two reasons. First, the test set is often involved, in one way or another, in training the predictor, which yields optimistically biased performance estimates. Second, the published results are often based on a small number of relatively simple datasets. Consequently, no generally applicable conclusions can be drawn. We therefore adopted an unbiased protocol to perform a fair comparison of state-of-the-art multivariate and univariate gene selection techniques, in combination with a range of classifiers. Our conclusions are based on seven gene expression datasets spanning many cancer types. Surprisingly, we could not detect any significant improvement of multivariate feature selection techniques over univariate approaches. We speculate on the possible causes of this finding, ranging from the small sample size problem to the particular nature of the multivariate gene dependencies.
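Illustration (not part of the original record): the abstract does not spell out the unbiased protocol, so the following is only a minimal sketch of the general idea it describes, namely that gene selection must be repeated inside every training fold so that the test samples never influence which genes are chosen. The signal-to-noise ranking, the nearest-mean classifier, the synthetic data, and all function names are assumptions made for this example and are not taken from the paper.

```python
import random
from statistics import mean, pstdev


def snr_scores(X, y):
    """Univariate signal-to-noise ratio per gene: |m1 - m0| / (s1 + s0)."""
    scores = []
    for g in range(len(X[0])):
        a = [row[g] for row, lab in zip(X, y) if lab == 0]
        b = [row[g] for row, lab in zip(X, y) if lab == 1]
        denom = (pstdev(a) + pstdev(b)) or 1e-12  # guard against zero spread
        scores.append(abs(mean(a) - mean(b)) / denom)
    return scores


def select_top(X, y, k):
    """Indices of the k genes with the highest signal-to-noise ratio."""
    scores = snr_scores(X, y)
    return sorted(range(len(scores)), key=lambda g: scores[g], reverse=True)[:k]


def nearest_mean_predict(X_train, y_train, X_test, genes):
    """Assign each test sample to the class with the closest centroid."""
    centroids = {}
    for lab in (0, 1):
        rows = [r for r, l in zip(X_train, y_train) if l == lab]
        centroids[lab] = [mean(r[g] for r in rows) for g in genes]
    preds = []
    for r in X_test:
        dist = {lab: sum((r[g] - c) ** 2 for g, c in zip(genes, centroids[lab]))
                for lab in (0, 1)}
        preds.append(min(dist, key=dist.get))
    return preds


def stratified_folds(y, n_folds, rng):
    """Split sample indices into folds that keep both classes represented."""
    folds = [[] for _ in range(n_folds)]
    for lab in (0, 1):
        idx = [i for i, l in enumerate(y) if l == lab]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % n_folds].append(i)
    return folds


def cv_error(X, y, k, n_folds=5, seed=0):
    """Cross-validated error with gene selection redone in every fold."""
    rng = random.Random(seed)
    errors = 0
    for test_idx in stratified_folds(y, n_folds, rng):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        X_tr = [X[i] for i in train_idx]
        y_tr = [y[i] for i in train_idx]
        # Key step: genes are ranked on the training samples only.
        genes = select_top(X_tr, y_tr, k)
        preds = nearest_mean_predict(X_tr, y_tr, [X[i] for i in test_idx], genes)
        errors += sum(p != y[i] for p, i in zip(preds, test_idx))
    return errors / len(y)


def make_toy_data(n_per_class=20, n_genes=50, n_informative=3, shift=4.0, seed=1):
    """Synthetic two-class data; only the first few genes carry signal."""
    rng = random.Random(seed)
    X, y = [], []
    for lab in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
            if lab == 1:
                for g in range(n_informative):
                    row[g] += shift
            X.append(row)
            y.append(lab)
    return X, y
```

Calling `cv_error(*make_toy_data(), k=3)` returns the fraction of held-out samples that were misclassified. Ranking the genes once on the full dataset before cross-validation would instead let the test samples influence the selection, which is the kind of optimistic bias the abstract warns about.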
Keywords :
cancer; cellular biophysics; feature extraction; genetics; molecular biophysics; tumours; cancer type; disease state; gene expression data; multivariate feature selection technique; multivariate gene dependency; multivariate gene selection; predictor training; univariate gene selection; Cancer; Clustering algorithms; Colon; Diseases; Gene expression; Protocols; Signal to noise ratio; Support vector machine classification; Support vector machines; System testing;
Conference_Title :
Computational Systems Bioinformatics Conference, 2005. Workshops and Poster Abstracts. IEEE
Print_ISBN :
0-7695-2442-7
DOI :
10.1109/CSBW.2005.95