Abstract:
In this article I derive an alternative to Hudson and Kaplan's algorithm (Genetics 111, 147–165) for computing a lower bound on the number of recombination events in a sample's history. It is shown that the number, T_M, found by the algorithm is the least number of topologies required to explain a set of DNA sequences sampled under the infinite-site assumption. Let (T_k) be a list of topologies compatible with the sequences, i.e., each T_k is compatible with an interval, I_k, of sites in the alignment. A characterization of all lists having T_M topologies is given, and it is shown that T_M relates to specific patterns in the alignment, here called chain series. Further, a number of theorems relating general lists of topologies to the number T_M are presented. The results are discussed in relation to the true minimum number of recombination events required to explain an alignment.
Keywords:
algorithm, recombination, SNP, topology