• DocumentCode
    1080299
  • Title

    Counting All Possible Ancestral Configurations of Sample Sequences in Population Genetics

  • Author

    Song, Y.S. ; Lyngso, R. ; Hein, J.

  • Author_Institution
    Dept. of Computer Science, University of California, Davis, CA
  • Volume
    3
  • Issue
    3
  • fYear
    2006
  • Firstpage
    239
  • Lastpage
    251
  • Abstract
    Given a set D of input sequences, a genealogy for D can be constructed backward in time using such evolutionary events as mutation, coalescent, and recombination. An ancestral configuration (AC) can be regarded as the multiset of all sequences present at a particular point in time in a possible genealogy for D. The complexity of computing the likelihood of observing D depends heavily on the total number of distinct ACs of D and, therefore, it is of interest to estimate that number. For D consisting of binary sequences of finite length, we consider the problem of enumerating exactly all distinct ACs. We assume that the root sequence type is known and that the mutation process is governed by the infinite-sites model. When there is no recombination, we construct a general method of obtaining closed-form formulas for the total number of ACs. The enumeration problem becomes much more complicated when recombination is involved. In that case, we devise a method of enumeration based on counting contingency tables and construct a dynamic programming algorithm for the approach. Last, we describe a method of counting the number of ACs that can appear in genealogies with at most a given number R of recombinations. Of particular interest is the case in which R is close to the minimum number of recombinations for D.
  • Keywords
    biology computing; cellular biophysics; dynamic programming; genetics; molecular biophysics; molecular configurations; physiological models; ancestral configurations; coalescent; contingency tables; dynamic programming algorithm; evolutionary events; genealogy; infinite-sites model; mutation; population genetics; recombination; sample sequences; AC generators; Binary sequences; Binary trees; Dynamic programming; Genetic mutations; Heuristic algorithms; Mathematical model; Stochastic processes; Tree graphs; Ancestral configurations; coalescent; contingency table; enumeration; recombination; Biological Evolution; Chromosome Mapping; Evolution, Molecular; Genetic Variation; Genetics, Population; Models, Genetic; Models, Statistical; Pedigree; Phylogeny; Sample Size; Sequence Alignment; Sequence Analysis, DNA
  • fLanguage
    English
  • Journal_Title
    Computational Biology and Bioinformatics, IEEE/ACM Transactions on
  • Publisher
    IEEE
  • ISSN
    1545-5963
  • Type

    jour

  • DOI
    10.1109/TCBB.2006.31
  • Filename
    1668023