DocumentCode
258112
Title
Optimal haplotype assembly with statistical pruning
Author
Das, Shreepriya ; Vikalo, Haris
Author_Institution
ECE Dept., Univ. of Texas at Austin, Austin, TX, USA
fYear
2014
fDate
3-5 Dec. 2014
Firstpage
1330
Lastpage
1333
Abstract
Solving the haplotype assembly problem by optimizing the commonly used minimum error correction criterion is known to be NP-hard. For this reason, suboptimal heuristics are often used in practice. In this paper, we propose a novel method for optimal haplotype assembly that is based on depth-first branch-and-bound search of the solution space. Our scheme is inspired by the sphere decoding algorithms used heavily in the field of digital communications. Using the statistical information about errors in sequencing data, we constrain the search of the haplotype space and quickly find the optimal solution to the haplotype assembly problem. Theoretical analysis of the expected complexity of the algorithm shows that optimal haplotype assembly is practically feasible for haplotype blocks of moderate lengths typically obtained using present-day high-throughput sequencers. The scheme is then tested on 1000 Genomes Project experimental data to verify the efficacy of the proposed method.
Keywords
biology computing; computational complexity; genomics; statistical analysis; NP-hard; digital communications; genomes project experimental data; haplotype assembly problem; high throughput sequencers; minimum error correction criterion; optimal haplotype assembly; sequencing data; sphere decoding algorithms; statistical information; statistical pruning; suboptimal heuristics; Assembly; Bioinformatics; Biological cells; Complexity theory; Genomics; Sequential analysis; Signal processing algorithms;
fLanguage
English
Publisher
IEEE
Conference_Titel
Signal and Information Processing (GlobalSIP), 2014 IEEE Global Conference on
Conference_Location
Atlanta, GA
Type
conf
DOI
10.1109/GlobalSIP.2014.7032339
Filename
7032339