Title :
Multimodality as a criterion for feature selection in unsupervised analysis of gene expression data
Author :
Li, Yi ; Sung, Wing-Kin ; Miller, Lance D.
Author_Institution :
Genome Inst. of Singapore, Singapore
Abstract :
Gene expression data are often analysed in an unsupervised manner by clustering the samples without reference to any annotations about them. Before clustering, the data are often subjected to a feature selection preprocessing step, in which a subset of genes is chosen for further analysis. We examine the use of multimodality as a criterion for choosing genes in feature selection, and also propose a novel measure of pairwise dissimilarity for clustering the genes that survive the preprocessing step. The resulting gene subsets usually include some that are more strongly correlated with the sample annotations of interest than those obtained through variance-based feature selection. Class discovery may be facilitated when gene expression data are analysed using the proposed method.
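To make the contrast between multimodality-based and variance-based selection concrete, here is a minimal sketch in Python. The abstract does not say how multimodality is measured, so the score used here is an assumption: the fraction of a gene's variance explained by the best two-group split of its sorted expression values. The names `bimodality_score` and `select_genes` and the toy data are hypothetical and are not taken from the paper.

```python
from statistics import pvariance


def bimodality_score(values):
    """Fraction of variance explained by the best two-group split.

    Illustrative stand-in for a multimodality measure (an assumption,
    not the paper's criterion). Returns a value in [0, 1].
    """
    xs = sorted(values)
    n = len(xs)
    total = pvariance(xs) * n  # total sum of squares
    if n < 4 or total == 0:
        return 0.0
    best_within = total
    # Try every split point that leaves at least two samples per group.
    for k in range(2, n - 1):
        left, right = xs[:k], xs[k:]
        within = pvariance(left) * len(left) + pvariance(right) * len(right)
        best_within = min(best_within, within)
    return 1.0 - best_within / total


def select_genes(expr, top_n, score=bimodality_score):
    """Return the top_n gene names ranked by score, highest first.

    expr maps a gene name to its expression values across samples.
    """
    ranked = sorted(expr, key=lambda g: score(expr[g]), reverse=True)
    return ranked[:top_n]


# Toy data: one gene with two tight groups, one widely spread gene.
expr = {
    "bimodal": [1.0, 1.1, 0.9, 1.0, 5.0, 5.1, 4.9, 5.0],
    "noisy": [0.0, 8.0, 2.0, 6.0, 4.0, 10.0, 1.0, 9.0],
    "flat": [3.0, 3.1, 2.9, 3.0, 3.1, 2.9, 3.0, 3.0],
}

by_modality = select_genes(expr, 1)                  # two-group score
by_variance = select_genes(expr, 1, score=pvariance) # variance baseline
```

On this toy data the two criteria disagree: the variance baseline favours the widely spread gene, while the two-group score favours the gene whose samples fall into two tight groups. This particular score also rates evenly spread values fairly highly, so it is only a rough proxy for multimodality.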
Keywords :
biology computing; genetics; molecular biophysics; statistical analysis; class discovery; clustering; feature selection; feature selection preprocessing; gene expression; multimodality; pairwise dissimilarity; sample annotations; unsupervised analysis; Bioinformatics; Biological processes; Biomedical engineering; Clustering algorithms; Data analysis; Gene expression; Genomics; Partitioning algorithms; Performance evaluation; Web server
Conference_Title :
Bioinformatics and Bioengineering, 2005. BIBE 2005. Fifth IEEE Symposium on
Print_ISBN :
0-7695-2476-1
DOI :
10.1109/BIBE.2005.42