• DocumentCode
    952318
  • Title

    Gene Clustering via Integrated Markov Models Combining Individual and Pairwise Features

  • Author

    Vignes, Matthieu ; Forbes, Florence

  • Author_Institution
    BioSS, Scottish Crop Res. Inst., Dundee
  • Volume
    6
  • Issue
    2
  • fYear
    2009
  • Firstpage
    260
  • Lastpage
    270
  • Abstract
    Clustering of genes into groups sharing common characteristics is a useful exploratory technique for a number of subsequent computational analyses. A wide range of clustering algorithms have been proposed, in particular to analyze gene expression data, but most of them consider genes as independent entities or include relevant information on gene interactions in a suboptimal way. We propose a probabilistic model that has the advantage of accounting for individual data (e.g., expression) and pairwise data (e.g., interaction information coming from biological networks) simultaneously. Our model is based on hidden Markov random field models in which parametric probability distributions account for the distribution of individual data. Data on pairs, possibly reflecting distance or similarity measures between genes, are then included through a graph, where the nodes represent the genes, and the edges are weighted according to the available interaction information. As a probabilistic model, this model has many interesting theoretical features. In addition, preliminary experiments on simulated and real data show promising results and point out the gain in using such an approach. Availability: The software used in this work is written in C++ and is available with other supplementary material at http://mistis.inrialpes.fr/people/forbes/transparentia/supplementary.html.
  • Keywords
    Markov processes; bioinformatics; cellular biophysics; genetics; genomics; molecular biophysics; pattern clustering; probability; clustering algorithms; gene clustering; gene expression; gene interactions; hidden Markov random field models; integrated Markov models; parametric probability distributions; probabilistic model; Markov random fields; metabolic networks; model-based clustering; Algorithms; Cluster Analysis; Computer Simulation; Gene Expression Profiling; Gene Regulatory Networks; Glycolysis; Markov Chains; Metabolic Networks and Pathways; Multigene Family; RNA Polymerase II; Saccharomyces cerevisiae; Software
  • fLanguage
    English
  • Journal_Title
    Computational Biology and Bioinformatics, IEEE/ACM Transactions on
  • Publisher
    IEEE
  • ISSN
    1545-5963
  • Type

    jour

  • DOI
    10.1109/TCBB.2007.70248
  • Filename
    4359897