DocumentCode :
2507446
Title :
EPBC: Enhanced Possibilistic Biclustering with Application to Gene Expression Analysis
Author :
Mahfouz, Mohamed A. ; Ismail, Mohamed A.
Author_Institution :
Dept. of Comput. & Syst. Eng., Alexandria Univ., Alexandria, Egypt
fYear :
2009
fDate :
11-13 June 2009
Firstpage :
1
Lastpage :
6
Abstract :
Biclustering is an important data mining technique for analyzing gene expression data from microarray technology: it identifies groups of genes that behave similarly under a subset of conditions. Because a gene may play more than one biological role in conjunction with distinct groups of genes, possibilistic biclustering algorithms can give insight into the different biological processes in which each gene might participate, the degree of its participation, and the conditions under which its participation is most effective. This paper proposes modifications to the possibilistic biclustering algorithm (PBC) introduced by Maurizio Filippone et al. in 2004, in which the mean square residue is minimized while the size of the bicluster is maximized by computing the zeros of the derivative of the objective function with respect to the row and column memberships. That algorithm suffers from serious drawbacks. First, in computing the derivative of the objective function, it treats the residue as a constant, even though changing the membership of a row or a column affects the residue of every entry in the bicluster, since it changes the average of the whole bicluster and the average of each column or row, respectively. Furthermore, the algorithm is strongly sensitive to its two input parameters. In this paper, the derivatives are computed accurately, and the objective function is modified so that only a single parameter is needed, which allows us to develop a procedure for approximating a range of suitable values for this parameter.
Although the accurate computation of the derivatives slightly increases the runtime of the proposed algorithm, an experimental study on yeast and several artificial datasets with embedded constant and additive modules at different noise levels shows that our algorithm can offer substantial improvements in the quality of the output biclusters over several previously proposed biclustering algorithms.
Keywords :
biology computing; data mining; fuzzy set theory; genetics; mean square error methods; pattern clustering; possibility theory; EPBC; artificial dataset; biological process; data mining technique; enhanced possibilistic biclustering; fuzzy clustering; gene expression analysis; mean square method; microarray technology; yeast; Application software; Biological processes; Clustering algorithms; Data analysis; Data engineering; Data mining; Embedded computing; Gene expression; Iterative algorithms; Systems engineering and theory;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Bioinformatics and Biomedical Engineering, 2009. ICBBE 2009. 3rd International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4244-2901-1
Electronic_ISBN :
978-1-4244-2902-8
Type :
conf
DOI :
10.1109/ICBBE.2009.5162791
Filename :
5162791