Title of article :
Cross model validated feature selection based on gene clusters
Author/Authors :
Gidskehaug, Lars and Anderssen, Endre and Alsberg, Bjørn K.
Issue Information :
Biannual, serial issue number, 2006
Abstract :
A framework is presented for feature selection of expression data that regards clusters of genes with similar expression rather than each gene individually. Predictive models based on coherent sets of genes are believed to be more robust than models in which each gene is treated separately. There is also evidence that such procedures may be able to detect differential expression in genes that would otherwise go undetected. The interpretation of the results is much simplified as the significant genes are ordered in groups that may represent biological relationships.
Discriminant partial least squares regression is used for classifying two leukaemia subtypes. Clusters from a hierarchical clustering are tested for significance by jack-knife. Cross model validation is used both to detect the optimal partitioning of genes and to validate the feature selection. A predictive model based on 24 out of 500 initial clusters proved to outperform a model based on single genes for the presented data. Some of the selected clusters were shown to be biologically meaningful; others may give clues to functional relationships.
Keywords :
Hierarchical clustering , Microarrays , feature selection , PLSR , Cross model validation , False discovery rate
Journal title :
Chemometrics and Intelligent Laboratory Systems