• DocumentCode
    84560
  • Title
    Maximizing Deep Coalescence Cost
  • Author
    Gorecki, Pawel ; Eulenstein, Oliver
  • Author_Institution
    Dept. of Math., Inf. & Mech., Univ. of Warsaw, Warsaw, Poland
  • Volume
    11
  • Issue
    1
  • fYear
    2014
  • fDate
    Jan.-Feb. 2014
  • Firstpage
    231
  • Lastpage
    242
  • Abstract
    The minimizing deep coalescence (MDC) problem seeks a species tree that reconciles the given gene trees with the minimum number of deep coalescence events, called the deep coalescence (DC) cost. To better assess MDC species trees, we investigate a basic mathematical property of the DC cost, called the diameter. Given a gene tree, a species tree, and a leaf labeling function that assigns each leaf-gene of the gene tree to the leaf-species in the species tree from which it was sampled, the DC cost describes the discordance between the trees caused by deep coalescence events. The diameter of a gene tree and a species tree is the maximum DC cost across all leaf labelings for these trees. We prove fundamental mathematical properties that describe these diameters precisely for bijective and general leaf labelings, and present efficient algorithms to compute the diameters and their corresponding leaf labelings. In particular, we describe an optimal, i.e., linear time, algorithm for the bijective case. Finally, in an experimental study we demonstrate that the average diameters between a gene tree and a species tree grow significantly more slowly than their naive upper bounds, suggesting that our exact bounds can significantly improve the assessment of DC costs when using diameters.
  • Keywords
    bioinformatics; evolution (biological); genetics; trees (mathematics); MDC problem; MDC species tree; basic mathematical property; bijective leaf labelings; deep coalescence cost maximization; diameter computation algorithms; gene tree diameter; gene trees; general leaf labelings; leaf labeling computation algorithms; leaf labeling function; leaf-gene assignment; leaf-species; maximum DC cost; minimizing deep coalescence problem; minimum deep coalescence event number; naive upper bounds; optimal linear time algorithm; species tree diameter; Bioinformatics; Computational biology; Joining processes; Labeling; Phylogeny; Shape; Vegetation; Deep coalescence; bijective leaf labeling; cost function; diameter; gene tree; leaf labeling; species tree; tree reconciliation;
  • fLanguage
    English
  • Journal_Title
    Computational Biology and Bioinformatics, IEEE/ACM Transactions on
  • Publisher
    IEEE
  • ISSN
    1545-5963
  • Type
    jour
  • DOI
    10.1109/TCBB.2013.144
  • Filename
    6657669
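
The DC cost and the diameter described in the abstract can be illustrated with a minimal Python sketch. It rests on several assumptions that are not taken from the paper: trees are written as nested tuples with string leaves, the DC cost is computed as the standard "extra lineages" count (gene lineages on each species-tree edge, minus one, summed over all edges), only bijective leaf labelings are handled, and the diameter is found by brute-force enumeration of all labelings rather than by the linear-time algorithm the abstract mentions. The function names `leaves`, `cluster_depths`, `dc_cost`, and `bijective_diameter` are hypothetical.

```python
from itertools import permutations


def leaves(tree):
    """Return the leaf names of a nested-tuple tree, left to right."""
    if isinstance(tree, str):
        return [tree]
    return [name for child in tree for name in leaves(child)]


def cluster_depths(species_tree):
    """Map each species-tree node, given as its leaf set, to its depth."""
    depths = {}

    def walk(node, depth):
        depths[frozenset(leaves(node))] = depth
        if not isinstance(node, str):
            for child in node:
                walk(child, depth + 1)

    walk(species_tree, 0)
    return depths


def dc_cost(gene_tree, species_tree, labeling):
    """Extra-lineages count for a bijective labeling (assumed definition)."""
    depths = cluster_depths(species_tree)

    def lca_depth(species_set):
        # The LCA is the deepest species-tree cluster containing the set.
        return max(d for c, d in depths.items() if species_set <= c)

    total_path_length = 0

    def walk(node):
        nonlocal total_path_length
        if isinstance(node, str):
            species_set = frozenset([labeling[node]])
            return species_set, lca_depth(species_set)
        child_infos = [walk(child) for child in node]
        species_set = frozenset().union(*(s for s, _ in child_infos))
        depth = lca_depth(species_set)
        # Each gene edge crosses (child depth - parent depth) species edges.
        for _, child_depth in child_infos:
            total_path_length += child_depth - depth
        return species_set, depth

    walk(gene_tree)
    species_edges = len(depths) - 1  # every node except the root has an edge
    # Under a bijective labeling every species edge carries >= 1 lineage,
    # so the extra lineages are the total crossings minus one per edge.
    return total_path_length - species_edges


def bijective_diameter(gene_tree, species_tree):
    """Brute force: maximum DC cost over all bijective leaf labelings."""
    genes = leaves(gene_tree)
    best_cost, best_labeling = -1, None
    for perm in permutations(leaves(species_tree)):
        labeling = dict(zip(genes, perm))
        cost = dc_cost(gene_tree, species_tree, labeling)
        if cost > best_cost:
            best_cost, best_labeling = cost, labeling
    return best_cost, best_labeling


gene_tree = (("a", "b"), ("c", "d"))
species_tree = (("A", "B"), ("C", "D"))
matching = {"a": "A", "b": "B", "c": "C", "d": "D"}
crossed = {"a": "A", "b": "C", "c": "B", "d": "D"}
print(dc_cost(gene_tree, species_tree, matching))
print(dc_cost(gene_tree, species_tree, crossed))
print(bijective_diameter(gene_tree, species_tree)[0])
```

The enumeration visits n! labelings for n leaves, so this sketch is only practical for very small trees; it is meant to make the definitions concrete, not to stand in for the efficient algorithms of the article.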