• DocumentCode
    1085661
  • Title

    Mixed Integer Linear Programming for Maximum-Parsimony Phylogeny Inference

  • Author

    Sridhar, Srinath ; Lam, Fumei ; Blelloch, Guy E. ; Ravi, R. ; Schwartz, Russell

  • Author_Institution
    Dept. of Comput. Sci., Carnegie Mellon Univ., Pittsburgh, PA
  • Volume
    5
  • Issue
    3
  • fYear
    2008
  • Firstpage
    323
  • Lastpage
    331
  • Abstract
    Reconstruction of phylogenetic trees is a fundamental problem in computational biology. While excellent heuristic methods are available for many variants of this problem, new advances in phylogeny inference will be required if we are to continue making effective use of the rapidly growing stores of variation data now being gathered. In this paper, we present two integer linear programming (ILP) formulations to find the most parsimonious phylogenetic tree from a set of binary variation data. One method uses a flow-based formulation that can produce exponential numbers of variables and constraints in the worst case. The method has, however, proven extremely efficient in practice on datasets that are well beyond the reach of the available provably efficient methods, solving several large mtDNA and Y-chromosome instances within a few seconds and giving provably optimal results in times competitive with fast heuristics that cannot guarantee optimality. An alternative formulation establishes that the problem can be solved with a polynomial-sized ILP. We further present a web server, built on the exponential-sized ILP, that performs fast maximum parsimony inferences and serves as a front end to a database of precomputed phylogenies spanning the human genome.
  • Keywords
    biology computing; genetics; heuristic programming; linear programming; Web server; Y-chromosome; binary variation data; computational biology; exponential-sized ILP; heuristic methods; human genome; maximum-parsimony phylogeny inference; mixed integer linear programming; mtDNA; phylogenetic tree reconstruction; precomputed phylogenies; Algorithms; Computational Biology; Integer Linear Programming; Maximum parsimony; Phylogenetic tree reconstruction; Steiner tree problem; Chromosome Mapping; Computer Simulation; Evolution; Evolution, Molecular; Linkage Disequilibrium; Models, Genetic; Phylogeny; Programming, Linear; Sequence Analysis, DNA;
  • fLanguage
    English
  • Journal_Title
    Computational Biology and Bioinformatics, IEEE/ACM Transactions on
  • Publisher
    IEEE
  • ISSN
    1545-5963
  • Type

    jour

  • DOI
    10.1109/TCBB.2008.26
  • Filename
    4459305