DocumentCode
3230320
Title
Biclustering of gene expression data by simulated annealing
Author
Chakraborty, Anupam
Author_Institution
Dept. of Comput. Sci. & Eng., Indian Inst. of Technol., Kharagpur
fYear
2005
fDate
1-1 July 2005
Lastpage
632
Abstract
A bicluster of a gene expression dataset is a subset of genes that exhibit similar expression patterns across a subset of conditions. Biclustering algorithms aim to find subsets of genes and subsets of conditions such that a single cellular process is the main contributor to the expression of the gene subset over the condition subset. We believe that biclusters should be small compared to the gene expression data matrix. We have also observed that a conceptually simpler way to bicluster gene expression data is to apply standard one-way clustering algorithms to the rows and columns of the data matrix separately and then combine the results to obtain bicluster seeds. Our algorithm has three phases. In the first phase, we generate a set of high-quality bicluster seeds. In the second phase, these seeds are enlarged by adding more genes and conditions using a technique based on simulated annealing. In the third phase, we compute the p-values of the resulting biclusters for statistical validation.
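The second phase described above (growing a seed by simulated annealing) can be illustrated with a minimal sketch. The abstract does not state the scoring function, the cooling schedule, or the move set, so everything below is an assumption made for illustration: the Cheng-Church mean squared residue stands in as the coherence score, the cost subtracts a hypothetical `size_weight` term to reward larger biclusters, the temperature follows a geometric schedule, and all function and parameter names are invented here rather than taken from the paper.

```python
import math
import random


def mean_squared_residue(data, rows, cols):
    """Mean squared residue of the submatrix; lower means more coherent.

    ASSUMPTION: the abstract does not name a score; this one is a stand-in.
    """
    rows, cols = sorted(rows), sorted(cols)
    row_mean = {i: sum(data[i][j] for j in cols) / len(cols) for i in rows}
    col_mean = {j: sum(data[i][j] for i in rows) / len(rows) for j in cols}
    total_mean = sum(row_mean.values()) / len(rows)
    residue = sum(
        (data[i][j] - row_mean[i] - col_mean[j] + total_mean) ** 2
        for i in rows
        for j in cols
    )
    return residue / (len(rows) * len(cols))


def cost(data, rows, cols, size_weight):
    """Hypothetical objective: low residue is good, larger area is good."""
    area = len(rows) * len(cols)
    return mean_squared_residue(data, rows, cols) - size_weight * area


def anneal_bicluster(data, seed_rows, seed_cols, size_weight=0.01,
                     t_start=1.0, t_end=1e-3, cooling=0.95,
                     moves_per_temp=50, rng=None):
    """Grow a seed bicluster by proposing one added gene or condition at a time."""
    rng = rng or random.Random(0)
    n_rows, n_cols = len(data), len(data[0])
    rows, cols = set(seed_rows), set(seed_cols)
    current = cost(data, rows, cols, size_weight)
    best = (current, set(rows), set(cols))
    t = t_start
    while t > t_end:
        for _ in range(moves_per_temp):
            # Candidate moves: any gene (row) or condition (column) not yet included.
            candidates = [("row", i) for i in range(n_rows) if i not in rows]
            candidates += [("col", j) for j in range(n_cols) if j not in cols]
            if not candidates:
                return best[1], best[2]
            kind, index = rng.choice(candidates)
            new_rows = rows | {index} if kind == "row" else rows
            new_cols = cols | {index} if kind == "col" else cols
            delta = cost(data, new_rows, new_cols, size_weight) - current
            # Metropolis rule: always accept improvements, sometimes accept worse moves.
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                rows, cols, current = new_rows, new_cols, current + delta
                if current < best[0]:
                    best = (current, set(rows), set(cols))
        t *= cooling  # geometric cooling (assumed schedule)
    return best[1], best[2]
```

Because this sketch only ever adds genes or conditions, the returned bicluster always contains the seed. A fuller treatment would also need the seed generation and p-value phases, which are not sketched here.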
Keywords
biology computing; cellular biophysics; genetics; molecular biophysics; simulated annealing; bicluster seeds; biclustering algorithms; cellular process; gene expression data matrix; k-means clustering; one-way clustering algorithms; p-value; statistical validation; Clustering algorithms; Computational modeling; Computer science; DNA; Data analysis; Data engineering; Gene expression; Interference; Iterative algorithms
fLanguage
English
Publisher
IEEE
Conference_Titel
High-Performance Computing in Asia-Pacific Region, 2005. Proceedings. Eighth International Conference on
Conference_Location
Beijing
Print_ISBN
0-7695-2486-9
Type
conf
DOI
10.1109/HPCASIA.2005.25
Filename
1592333