• DocumentCode
    1484567
  • Title

    Multiscale Binarization of Gene Expression Data for Reconstructing Boolean Networks

  • Author

    Hopfensitz, M. ; Mussel, C. ; Wawra, C. ; Maucher, M. ; Kuhl, M. ; Neumann, H. ; Kestler, H.A.

  • Author_Institution
    Res. Group of Bioinf. & Syst. Biol., Ulm Univ., Ulm, Germany
  • Volume
    9
  • Issue
    2
  • fYear
    2012
  • Firstpage
    487
  • Lastpage
    498
  • Abstract
    Network inference algorithms can assist life scientists in unraveling gene-regulatory systems on a molecular level. In recent years, great attention has been drawn to the reconstruction of Boolean networks from time series. These need to be binarized, as such networks model genes as binary variables (either "expressed" or "not expressed"). Common binarization methods often cluster measurements or separate them according to statistical or information-theoretic characteristics and may require many data points to determine a robust threshold. Yet, time series measurements frequently comprise only a small number of samples. To overcome this limitation, we propose a binarization that incorporates measurements at multiple resolutions. We introduce two such binarization approaches which determine thresholds based on limited numbers of samples and additionally provide a measure of threshold validity. Thus, network reconstruction and further analysis can be restricted to genes with meaningful thresholds. This reduces the complexity of network inference. The performance of our binarization algorithms was evaluated in network reconstruction experiments using artificial data as well as real-world yeast expression time series. The new approaches yield considerably improved correct network identification rates compared to other binarization techniques by effectively reducing the number of candidate networks.
  • Keywords
    Boolean functions; binary sequences; biology computing; genetics; inference mechanisms; microorganisms; molecular biophysics; time series; Boolean networks; binarization; binary variables; gene expression data; gene-regulatory systems; multiscale binarization; network inference; network inference algorithms; yeast expression time series; Approximation error; Bioinformatics; Complexity theory; Computational biology; Gene expression; Time measurement; Time series analysis; Binarization; Boolean networks; gene-regulatory networks; reconstruction; Algorithms; Computational Biology; Databases, Genetic; Gene Expression Profiling; Gene Regulatory Networks; Models, Genetic; Saccharomyces cerevisiae
  • fLanguage
    English
  • Journal_Title
    Computational Biology and Bioinformatics, IEEE/ACM Transactions on
  • Publisher
    IEEE
  • ISSN
    1545-5963
  • Type

    jour

  • DOI
    10.1109/TCBB.2011.62
  • Filename
    5740845