Title :
Clustering Microarray Data by Using a Stochastic Algorithm
Author :
Shon, Ho Sun ; Kim, Sunshin ; Shin, Seung Jung ; Ryu, Keun Ho
Author_Institution :
Database/Bioinf. Lab., Chungbuk Nat. Univ., Cheongju
Abstract :
The clustering of gene expression data is used to analyze the results of microarray studies. This method is often useful in understanding how a particular class of genes functions together during a biological process. In this study, we performed clustering using the Markov cluster (MCL) algorithm, a graph clustering method based on the simulation of stochastic flow. It is a fast and efficient algorithm that clusters the nodes of a graph by simulating flow through the computation of probabilities. First, we converted the raw matrix into a sample matrix using the Euclidean distance between samples, computed over the genes. Second, we applied the MCL algorithm to the new Euclidean distance matrix and considered two factors, namely, the inflation and the diagonal terms of the matrix. We sought suitable values for these factors through extensive experiments. In addition, distance thresholds, i.e., the average of the data elements in each column, were used to clearly distinguish between groups. Our experimental results show about 70% accuracy on average compared with the previously known classes. We also compared the MCL algorithm with the self-organizing map (SOM), K-means, and hierarchical clustering (HC) algorithms.
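The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`sample_distances`, `to_adjacency`, `mcl`), the binary edge weights, the default inflation of 2.0, the diagonal value of 1.0, and the toy data are all assumptions chosen for illustration. The sketch computes Euclidean distances between samples, keeps an edge where a distance falls below the average of its column, and then alternates the standard MCL expansion and inflation steps.

```python
import math


def sample_distances(expr):
    """Euclidean distance between every pair of samples (columns of expr).

    expr is a genes x samples matrix given as a list of rows.
    """
    n = len(expr[0])
    cols = [[row[j] for row in expr] for j in range(n)]
    return [
        [math.sqrt(sum((a - b) ** 2 for a, b in zip(cols[i], cols[j])))
         for j in range(n)]
        for i in range(n)
    ]


def to_adjacency(dist, diagonal=1.0):
    """Threshold each column at its own average distance (assumed reading).

    Off-diagonal entries below the column average become edges of weight 1;
    the diagonal term (self-loop weight) is a tunable factor and must be > 0.
    """
    n = len(dist)
    adj = [[0.0] * n for _ in range(n)]
    for j in range(n):
        mean = sum(dist[i][j] for i in range(n)) / n
        for i in range(n):
            if i == j:
                adj[i][j] = diagonal
            elif dist[i][j] < mean:
                adj[i][j] = 1.0
    return adj


def normalize(m):
    """Scale every column to sum to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] for j in range(n)] for i in range(n)]


def mcl(adj, inflation=2.0, max_iter=100, tol=1e-9):
    """Basic MCL loop: expansion (matrix squaring) followed by inflation."""
    n = len(adj)
    m = normalize(adj)
    for _ in range(max_iter):
        expanded = [
            [sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)
        ]
        nxt = normalize([[v ** inflation for v in row] for row in expanded])
        delta = max(abs(nxt[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = nxt
        if delta < tol:
            break
    # Each row with nonzero entries is an attractor; its nonzero columns
    # are the members of that attractor's cluster.
    clusters = {frozenset(j for j in range(n) if row[j] > 1e-6) for row in m}
    clusters.discard(frozenset())
    return sorted(sorted(c) for c in clusters)


# Toy data (hypothetical): 3 genes x 4 samples, two obvious sample groups.
expr = [
    [1.0, 1.1, 5.0, 5.2],
    [2.0, 2.1, 8.0, 7.9],
    [0.5, 0.4, 3.0, 3.1],
]
clusters = mcl(to_adjacency(sample_distances(expr)))
print(clusters)  # [[0, 1], [2, 3]]
```

In this sketch, a larger `inflation` value tends to produce more, smaller clusters, and a larger `diagonal` value gives each sample more weight on itself. This is one plausible reason the two factors need tuning per data set.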
Keywords :
Markov processes; biology computing; pattern clustering; self-organising feature maps; Euclidean distance; K-means clustering; Markov cluster algorithm; biological process; gene expression data clustering; hierarchical clustering algorithms; microarray data clustering; self-organizing map clustering; stochastic algorithm; stochastic flow; K-means; MCL algorithm; Microarray; SOM; hierarchical clustering;
Conference_Title :
Computer and Information Technology Workshops, 2008. CIT Workshops 2008. IEEE 8th International Conference on
Conference_Location :
Sydney, NSW
Print_ISBN :
978-0-7695-3242-4
Electronic_ISBN :
978-0-7695-3239-1
DOI :
10.1109/CIT.2008.Workshops.117