• DocumentCode
    1065807
  • Title

    Biclustering algorithms for biological data analysis: a survey

  • Author

    Madeira, Sara C. ; Oliveira, Arlindo L.

  • Author_Institution
    Beira Interior Univ., Covilha, Portugal
  • Volume
    1
  • Issue
    1
  • fYear
    2004
  • Firstpage
    24
  • Lastpage
    45
  • Abstract
    A large number of clustering approaches have been proposed for the analysis of gene expression data obtained from microarray experiments. However, the results from the application of standard clustering methods to genes are limited. This limitation is imposed by the existence of a number of experimental conditions where the activity of genes is uncorrelated. A similar limitation exists when clustering of conditions is performed. For this reason, a number of algorithms that perform simultaneous clustering on the row and column dimensions of the data matrix have been proposed. The goal is to find submatrices, that is, subgroups of genes and subgroups of conditions, where the genes exhibit highly correlated activities for every condition. In this paper, we refer to this class of algorithms as biclustering. Biclustering is also referred to in the literature as coclustering and direct clustering, among other names, and has also been used in fields such as information retrieval and data mining. In this comprehensive survey, we analyze a large number of existing approaches to biclustering, and classify them in accordance with the type of biclusters they can find, the patterns of biclusters that are discovered, the methods used to perform the search, the approaches used to evaluate the solution, and the target applications.
  • Keywords
    biology computing; genetics; molecular biophysics; pattern clustering; biclustering algorithms; biological data analysis; coclustering; data mining; direct clustering; gene expression; gene subgroups; information retrieval; microarray experiments; Clustering algorithms; Clustering methods; Data analysis; Data mining; Gene expression; Information retrieval; Pattern analysis; Performance analysis; Performance evaluation; Semiconductor device measurement; Biclustering; bidimensional clustering; biological data analysis; block clustering; coclustering; direct clustering; gene expression data; microarray data analysis; simultaneous clustering; subspace clustering; two-mode clustering; two-sided clustering; two-way clustering; Algorithms; Cluster Analysis; Computational Biology; Gene Expression; Gene Expression Profiling; Humans; Models, Statistical; Oligonucleotide Array Sequence Analysis; Saccharomyces cerevisiae
  • fLanguage
    English
  • Journal_Title
    Computational Biology and Bioinformatics, IEEE/ACM Transactions on
  • Publisher
    IEEE
  • ISSN
    1545-5963
  • Type

    jour

  • DOI
    10.1109/TCBB.2004.2
  • Filename
    1324618