Title :
A new preprocessing procedure for the haplotype inference problem
Author :
Irurozki, Ekhine ; Lozano, José A.
Author_Institution :
Intell. Syst. Group, Univ. of the Basque Country, San Sebastian
Abstract :
A haplotype is a DNA sequence inherited from one parent. Haplotypes are especially important in the study of complex diseases because they carry more information than genotype data, so the next high-priority phase in human genomics is the development of a full haplotype map of the human genome. However, obtaining haplotype data is technically difficult and expensive. One computational approach to obtaining haplotype data from genotype data applies the pure parsimony criterion and is known as haplotype inference by pure parsimony (HIPP). HIPP has been proved to be NP-hard. We present a new preprocessing method that drastically decreases the number of relevant haplotypes. Several algorithms need to preprocess the data; for large problem instances this key procedure is even more important than the solving process itself. The preprocessing was tested on real and simulated data in combination with a tabu search, and the resulting algorithm proved competitive with the best current solvers.
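To make the pure parsimony criterion concrete, the following is a minimal sketch of the HIPP problem statement only. It is not the preprocessing method or the tabu search from the paper. It assumes the common encoding in which each genotype site is 0 or 1 when homozygous and 2 when heterozygous; the function names are hypothetical, and the exhaustive search is usable only on toy instances because the problem is NP-hard.

```python
from itertools import combinations, product


def resolves(h1, h2, g):
    """Return True if haplotypes h1 and h2 together explain genotype g.

    Assumed encoding: g[i] in {0, 1} is a homozygous site (both haplotypes
    carry that allele); g[i] == 2 is a heterozygous site (alleles differ).
    """
    for a, b, s in zip(h1, h2, g):
        if s == 2:
            if a == b:
                return False
        elif not (a == b == s):
            return False
    return True


def compatible_haplotypes(g):
    """All haplotypes that can take part in some resolution of g."""
    options = [(0, 1) if s == 2 else (s,) for s in g]
    return set(product(*options))


def hipp_bruteforce(genotypes):
    """Smallest haplotype set that resolves every genotype (toy sizes only)."""
    pool = sorted(set().union(*(compatible_haplotypes(g) for g in genotypes)))
    for k in range(1, len(pool) + 1):
        for subset in combinations(pool, k):
            # A haplotype may be paired with itself (fully homozygous genotype).
            if all(any(resolves(a, b, g) for a in subset for b in subset)
                   for g in genotypes):
                return set(subset)
    return set()


genotypes = [(2, 2), (0, 2), (0, 0)]
solution = hipp_bruteforce(genotypes)
```

The candidate pool here grows exponentially with the number of heterozygous sites, which illustrates why reducing the number of relevant haplotypes before the search matters for large instances.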
Keywords :
biocomputing; computational complexity; genomics; optimisation; search problems; DNA sequence; NP-hard problem; complex diseases; genotype data; haplotype inference by pure parsimony; human genomics; preprocessing procedure; tabu search; Bioinformatics; DNA; Discrete event simulation; Diseases; Genomics; Humans; Inference algorithms; Sequences; Testing
Conference_Title :
Evolutionary Computation, 2009. CEC '09. IEEE Congress on
Conference_Location :
Trondheim
Print_ISBN :
978-1-4244-2958-5
Electronic_ISBN :
978-1-4244-2959-2
DOI :
10.1109/CEC.2009.4983097