Title :
Construction, enumeration, and optimization of perfect phylogenies on multi-state data
Author :
Michael Coulombe;Kristian Stevens;Dan Gusfield
Author_Institution :
Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, 02139, United States
Abstract :
Perfect phylogenies are central to both evolutionary biology and population genetics. We implemented and evaluated algorithms for constructing, counting, and enumerating perfect phylogenies on data with an arbitrary number of states. Ours is the first program to implement the efficient algorithm of Agarwala and Fernández-Baca (1994) with the speedups and enumeration extensions of Kannan and Warnow (1995). It is written in C++ and uses specialized algorithms and data structures for faster and more compact execution. We have also included new extensions to the previously described algorithms. Our software can efficiently construct a phylogeny, determine its uniqueness, or determine that no phylogeny exists. It can handle input data with missing values and find a largest subset of compatible characters. It can count and enumerate the potentially exponential number of trees that may explain an input dataset. Using dynamic programming, it can find a smallest tree or a tree with maximum edge support. While many of these problems have been shown to be NP-hard, our implementations are demonstrably practical for many datasets. Our software can be downloaded at http://wwwcsif.cs.ucdavis.edu/ gusfield.
Keywords :
"Phylogeny","Indexes","Clustering algorithms","Software algorithms","Optimization","Labeling"
Conference_Title :
Computational Advances in Bio and Medical Sciences (ICCABS), 2015 IEEE 5th International Conference on
DOI :
10.1109/ICCABS.2015.7344709