DocumentCode
3165340
Title
Discovering Non-redundant Overlapping Biclusters on Gene Expression Data
Author
Truong, Duy Tin; Battiti, Roberto; Brunato, Mauro
fYear
2013
fDate
7-10 Dec. 2013
Firstpage
747
Lastpage
756
Abstract
Given a gene expression data matrix where each cell is the expression level of a gene under a certain condition, biclustering is the problem of searching for a subset of genes that are co-regulated and co-expressed only under a subset of conditions. As some genes can belong to different functional categories, searching for non-redundant overlapping biclusters is an important problem in biclustering. However, most recent algorithms can produce only either disjoint biclusters or redundant biclusters with significant overlap. In other words, these algorithms do not allow users to specify the maximum overlap between the biclusters. In this paper, we propose a novel algorithm which can generate K overlapping biclusters where the maximum overlap between them is below a predefined threshold. Unlike other approaches, which often generate all biclusters at once, our algorithm produces the biclusters sequentially, where each newly generated bicluster is guaranteed to be different from the previous ones but can still overlap with them. Experiments on real datasets confirm that different meaningful overlapping biclusters are successfully discovered. Moreover, under the same constraints, our algorithm returns much larger and higher-quality biclusters than those of other state-of-the-art algorithms.
Keywords
biology computing; data mining; pattern clustering; disjoint biclusters; functional category; gene expression data matrix; nonredundant overlapping bicluster discovery; Biological system modeling; Clustering algorithms; Coherence; Complexity theory; Gene expression; Search problems; gene expression data; non-redundant overlapping biclustering
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Mining (ICDM), 2013 IEEE 13th International Conference on
Conference_Location
Dallas, TX
ISSN
1550-4786
Type
conf
DOI
10.1109/ICDM.2013.36
Filename
6729559