Title :
QColors: An algorithm for conservative viral quasispecies reconstruction from short and non-contiguous next generation sequencing reads
Author :
Huang, Austin ; Kantor, Rami ; DeLong, Allison ; Schreier, Leeann ; Istrail, Sorin
Author_Institution :
Div. of Infectious Disease, Brown Univ., Providence, RI, USA
Abstract :
Next generation sequencing technologies have been successfully applied to HIV-infected patients in order to obtain the mutational spectrum of heterogeneous viral populations within individuals, known as quasispecies. However, the metagenomics problem of quasispecies sequence reconstruction from next generation sequencing reads is not yet widely applied in current practice and remains an emerging area of research. Furthermore, the majority of research methodology in HIV has focused on 454 sequencing, while many next-generation sequencing platforms are limited to shorter read lengths relative to 454 sequencing. Little work has been done in determining how best to address the read length limitations of other platforms. The approach described here incorporates graph representations of both read differences and read overlap to conservatively determine the regions of the sequence with sufficient variability to separate quasispecies sequences. Within these tractable regions of quasispecies inference, we use constraint programming to solve for an optimal quasispecies subsequence determination via vertex coloring of the conflict graph, a representation which also lends itself to data with non-contiguous reads such as paired-end sequencing. We demonstrate the utility of the method by successfully applying it to simulations based on actual intra-patient clonal HIV-1 sequencing data.
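The conflict-graph coloring idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that each read is a mapping from reference position to base (which also covers non-contiguous reads such as paired ends), that two reads conflict when they disagree at any shared position, and that a small backtracking search can stand in for the constraint programming solver. The names `conflicts`, `conflict_graph`, `color_with`, and `min_coloring` are hypothetical.

```python
from itertools import combinations


def conflicts(a, b):
    """Return True if two reads (dicts: position -> base) disagree at a shared position."""
    shared = a.keys() & b.keys()
    return any(a[p] != b[p] for p in shared)


def conflict_graph(reads):
    """Build an adjacency map with an edge between every pair of conflicting reads."""
    adj = {i: set() for i in range(len(reads))}
    for i, j in combinations(range(len(reads)), 2):
        if conflicts(reads[i], reads[j]):
            adj[i].add(j)
            adj[j].add(i)
    return adj


def color_with(adj, k):
    """Backtracking search for a proper coloring with at most k colors; None if impossible."""
    order = sorted(adj, key=lambda v: -len(adj[v]))  # most-constrained vertices first
    colors = {}

    def assign(idx):
        if idx == len(order):
            return True
        v = order[idx]
        used = {colors[u] for u in adj[v] if u in colors}
        # Symmetry breaking: allow existing colors plus at most one fresh color.
        limit = min(k, max(colors.values(), default=-1) + 2)
        for c in range(limit):
            if c not in used:
                colors[v] = c
                if assign(idx + 1):
                    return True
                del colors[v]
        return False

    return dict(colors) if assign(0) else None


def min_coloring(adj):
    """Try k = 1, 2, ... and return the first proper coloring found (fewest colors)."""
    for k in range(1, len(adj) + 1):
        result = color_with(adj, k)
        if result is not None:
            return result
    return {}


if __name__ == "__main__":
    # Toy reads (made up for illustration); the last one is non-contiguous.
    reads = [
        {0: "A", 1: "C", 2: "G"},
        {1: "C", 2: "G", 3: "T"},
        {0: "A", 1: "T", 2: "G"},
        {2: "G", 3: "A"},
        {0: "A", 3: "A"},
    ]
    coloring = min_coloring(conflict_graph(reads))
    for read_id, color in sorted(coloring.items()):
        print(f"read {read_id} -> candidate sequence {color}")
```

Under these assumptions, reads sharing a color are mutually compatible and could belong to the same quasispecies sequence, and the number of colors is a lower bound on the number of distinct sequences needed to explain the reads. The exhaustive search is exponential in the worst case, so it is only suitable for small regions.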
Keywords :
algorithm theory; constraint handling; diseases; genomics; graph colouring; graphs; microorganisms; HIV-infected patients; QColors; algorithm; conservative viral quasispecies reconstruction; constraint programming; graph representations; heterogeneous viral populations; intrapatient clonal HIV-1 sequencing data; metagenomics problem; mutational spectrum; noncontiguous next generation sequencing reads; paired-end sequencing; read length limitations; Bioinformatics; Error correction; Genomics; Human immunodeficiency virus; Image color analysis; Immune system; Programming; HIV; constraint programming; graph coloring; quasispecies; virology;
Conference_Title :
Bioinformatics and Biomedicine Workshops (BIBMW), 2011 IEEE International Conference on
Conference_Location :
Atlanta, GA
Print_ISBN :
978-1-4577-1612-6
DOI :
10.1109/BIBMW.2011.6112365