DocumentCode
2509410
Title
Gene Expression Analysis Using Clustering
Author
Dhiraj, Kumar ; Rath, Santanu Kumar ; Pandey, Abhishek
Author_Institution
Dept. of Comput. Sci. & Eng., Nat. Inst. of Technol. Rourkela, Rourkela, India
fYear
2009
fDate
11-13 June 2009
Firstpage
1
Lastpage
4
Abstract
Data mining has become an important topic in the effective analysis of gene expression data due to its wide application in the biomedical industry. In this paper, the k-means clustering algorithm is studied extensively for gene expression analysis. Since our purpose is to demonstrate the effectiveness of the k-means algorithm on a wide variety of data sets, we have chosen two pattern recognition data sets and thirteen microarray data sets with both overlapping and non-overlapping cluster boundaries, where the number of features/genes ranges from 4 to 7129 and the number of samples ranges from 32 to 683. The number of clusters ranges from two to eleven. We use the clustering error rate (or, equivalently, the clustering accuracy) as the evaluation metric to measure the performance of the k-means algorithm.
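The abstract does not say how the clustering error rate is computed. The sketch below shows one common definition, given here as an assumption and not as the authors' procedure: cluster ids are matched one-to-one to class labels in the way that maximises agreement, and the error rate is the fraction of samples left mismatched. The function name `clustering_error_rate` and the brute-force search over permutations are illustrative choices. The search is practical only for a small number of clusters; an assignment solver would be the usual replacement for larger counts.

```python
from itertools import permutations


def clustering_error_rate(true_labels, cluster_ids):
    """Fraction of samples misassigned under the best one-to-one mapping
    of cluster ids to class labels (an assumed definition, not taken
    from the paper)."""
    if len(true_labels) != len(cluster_ids):
        raise ValueError("label sequences must have the same length")
    if len(true_labels) == 0:
        raise ValueError("label sequences must be non-empty")

    # Distinct symbols on each side, in order of first appearance.
    classes = list(dict.fromkeys(true_labels))
    clusters = list(dict.fromkeys(cluster_ids))

    # Pad the class list with None so that surplus clusters can be left
    # unmatched when there are more clusters than classes.
    width = max(len(classes), len(clusters))
    padded = classes + [None] * (width - len(classes))

    best_correct = 0
    # Brute force over every one-to-one assignment of clusters to classes.
    for assignment in permutations(padded, len(clusters)):
        mapping = dict(zip(clusters, assignment))
        correct = sum(
            1 for t, c in zip(true_labels, cluster_ids) if mapping[c] == t
        )
        best_correct = max(best_correct, correct)

    return 1.0 - best_correct / len(true_labels)


# Hypothetical labels: three classes, one sample placed in the wrong cluster.
truth = [0, 0, 0, 1, 1, 1, 2, 2, 2]
found = [2, 2, 2, 0, 0, 1, 1, 1, 1]
error = clustering_error_rate(truth, found)
accuracy = 1.0 - error
```

Under this definition, clustering accuracy is one minus the error rate, so the two metrics carry the same information.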
Keywords
data mining; genetics; lab-on-a-chip; medical computing; pattern clustering; biomedical industry; gene expression analysis; k-means clustering algorithm; microarray data sets; nonoverlapping cluster boundaries; overlapping cluster boundaries; pattern recognition; Breast; Cancer; Clustering algorithms; Clustering methods; Fungi; Gene expression; Iris; Lungs; Partitioning algorithms; Pattern recognition
fLanguage
English
Publisher
IEEE
Conference_Titel
Bioinformatics and Biomedical Engineering, 2009. ICBBE 2009. 3rd International Conference on
Conference_Location
Beijing
Print_ISBN
978-1-4244-2901-1
Electronic_ISBN
978-1-4244-2902-8
Type
conf
DOI
10.1109/ICBBE.2009.5162877
Filename
5162877