DocumentCode :
1050339
Title :
Class Discovery From Gene Expression Data Based on Perturbation and Cluster Ensemble
Author :
Yu, Zhiwen ; Wong, Hau-San
Author_Institution :
Lab. of Intell. Comput., South China Univ. of Technol., Guangzhou, China
Volume :
8
Issue :
2
fYear :
2009
fDate :
6/1/2009 12:00:00 AM
Firstpage :
147
Lastpage :
160
Abstract :
Class discovery from gene expression data is an important task for cancer diagnosis. In this paper, we present a new framework for class discovery that integrates the perturbation technique, the cluster ensemble approach, and a cluster validity index. Specifically, it first generates a set of perturbed datasets from the original microarray data. Then Neural Gas, which serves as the basic clustering algorithm, is applied to obtain partitions from the original dataset and the perturbed datasets. Finally, a new cluster validity index, called the disagreement/agreement index (DAI), is designed to identify the number of classes in the dataset by considering the difference between the partition obtained from the original dataset and the partitions obtained from the perturbed datasets. Experiments on three synthetic datasets and four cancer datasets show that (1) DAI successfully discovers the underlying structure in all the synthetic datasets and most of the cancer datasets, and (2) DAI outperforms most state-of-the-art cluster validity indexes when applied to gene expression data.
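The three steps in the abstract (perturb, cluster, compare partitions) can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: Gaussian noise stands in for the perturbation technique, plain k-means stands in for Neural Gas, and a simple pairwise co-membership disagreement stands in for the DAI, whose actual definition is not given in this record. All function names and parameters (`perturb`, `kmeans`, `pair_disagreement`, `stability_score`, `sigma`, `n_perturbed`) are hypothetical.

```python
import random
from itertools import combinations


def perturb(data, sigma, rng):
    """Return a copy of data with Gaussian noise added (assumed perturbation)."""
    return [[x + rng.gauss(0.0, sigma) for x in row] for row in data]


def kmeans(data, k, rng, iters=20):
    """Plain Lloyd's k-means, a stand-in for Neural Gas; returns one label per row."""
    centers = [list(c) for c in rng.sample(data, k)]
    labels = [0] * len(data)
    for _ in range(iters):
        # Assign every sample to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda j: sum((a - b) ** 2 for a, b in zip(row, centers[j])))
            for row in data
        ]
        # Move each center to the mean of its members; empty clusters keep their center.
        for j in range(k):
            members = [row for row, lab in zip(data, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def pair_disagreement(p, q):
    """Fraction of sample pairs grouped together in one partition but not in the other."""
    pairs = list(combinations(range(len(p)), 2))
    differ = sum((p[i] == p[j]) != (q[i] == q[j]) for i, j in pairs)
    return differ / len(pairs)


def stability_score(data, k, n_perturbed=10, sigma=0.1, seed=0):
    """Mean disagreement between the original partition and the perturbed partitions."""
    rng = random.Random(seed)
    base = kmeans(data, k, rng)
    scores = [
        pair_disagreement(base, kmeans(perturb(data, sigma, rng), k, rng))
        for _ in range(n_perturbed)
    ]
    return sum(scores) / len(scores)
```

Under these assumptions, one might evaluate `stability_score(data, k)` for several candidate values of `k` and prefer the values whose partitions change least under perturbation. How the paper's DAI weighs disagreement against agreement may differ from this.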
Keywords :
bioinformatics; cancer; genomics; medical computing; pattern classification; pattern clustering; Neural Gas clustering algorithm; cancer diagnosis; class discovery; cluster ensemble approach; cluster validity index; disagreement-agreement index; gene expression data; microarray data; perturbation technique; perturbed dataset generation; Class discovery; cluster ensemble; cluster validity index; microarray; perturbation; Algorithms; Cluster Analysis; Gene Expression Profiling; Humans; Neoplasm Proteins; Neoplasms; Oligonucleotide Array Sequence Analysis; Pattern Recognition, Automated; Signal Transduction; Tumor Markers, Biological
fLanguage :
English
Journal_Title :
NanoBioscience, IEEE Transactions on
Publisher :
IEEE
ISSN :
1536-1241
Type :
jour
DOI :
10.1109/TNB.2009.2023321
Filename :
5061507