• DocumentCode
    3714658
  • Title
    PACH: Ploidy-AgnostiC Haplotyping
  • Author
    Sepideh Mazrouee
  • Author_Institution
    Computer Science Department, University of California Los Angeles, 3551 Boelter Hall, 90095-1596, United States
  • fYear
    2015
  • Firstpage
    1786
  • Lastpage
    1788
  • Abstract
    Organisms can be categorized by the number of copies of each chromosome they carry. In genomic studies, diploid organisms such as humans and mice have been the focus of extensive research for decades. Organisms with more than two sets of homologous chromosomes, however, have only recently received attention from the community, in studies of disease genomics, phylogenetics, and evolution. The presence of more than two copies of each chromosome in the cells of an organism, which is common in plants, some animals, and human body tissues, is referred to as polyploidy. To understand the structure of each chromosome, haplotype assembly is needed. Current computational algorithms for phasing, however, either focus on diploid organisms or fail to accurately reconstruct haplotypes of polyploid organisms. This has limited the scalability and generalizability of such algorithms. Therefore, there is a need for new algorithms that not only reconstruct chromosome copies accurately from DNA sequencing data but can also be applied to organisms of various ploidy levels. In this paper, we present PACH, a novel, ploidy-agnostic phasing framework. PACH is a fragment partitioning approach based on a fragment conflict graph model that quantifies inter-fragment dissimilarities. We introduce a partitioning approach followed by a partition merging technique to accurately group similar fragments into any number of partitions, depending on the ploidy level of the organism from which the sequencing data are derived. Our preliminary results demonstrate that PACH outperforms state-of-the-art computational techniques. The improvement in the MEC (Minimum Error Correction) score ranges from 82% to 98% on triploid, tetraploid, and decaploid data.
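The two quantities named in the abstract, pairwise fragment conflicts and the MEC score of a partitioning, can be illustrated with a minimal sketch. This is not the PACH implementation: the fragment representation (a dict from SNP position to allele), the function names `conflict`, `conflict_graph`, `consensus`, and `mec_score`, and the majority-vote consensus are all assumptions made for illustration. The sketch only scores a partitioning that is already given; it does not perform the partitioning or merging steps.

```python
from collections import Counter
from typing import Dict, Sequence, Tuple

# Hypothetical representation: SNP position -> observed allele.
# Positions a fragment does not cover are simply absent.
Fragment = Dict[int, str]


def conflict(a: Fragment, b: Fragment) -> int:
    """Count shared positions at which two fragments disagree."""
    return sum(1 for pos in a.keys() & b.keys() if a[pos] != b[pos])


def conflict_graph(fragments: Sequence[Fragment]) -> Dict[Tuple[int, int], int]:
    """Weighted edges (i, j) -> conflict count, kept only when nonzero."""
    edges = {}
    for i in range(len(fragments)):
        for j in range(i + 1, len(fragments)):
            weight = conflict(fragments[i], fragments[j])
            if weight > 0:
                edges[(i, j)] = weight
    return edges


def consensus(group: Sequence[Fragment]) -> Fragment:
    """Majority allele at every position covered by the group."""
    votes: Dict[int, Counter] = {}
    for frag in group:
        for pos, allele in frag.items():
            votes.setdefault(pos, Counter())[allele] += 1
    return {pos: c.most_common(1)[0][0] for pos, c in votes.items()}


def mec_score(partitions: Sequence[Sequence[Fragment]]) -> int:
    """Total corrections needed to make each fragment match its group's consensus."""
    total = 0
    for group in partitions:
        hap = consensus(group)
        total += sum(conflict(frag, hap) for frag in group)
    return total


# Toy fragments (invented data, not from the paper).
f0 = {0: "A", 1: "C", 2: "G"}
f1 = {1: "C", 2: "G", 3: "T"}
f2 = {0: "T", 1: "G", 2: "G"}
f3 = {1: "G", 2: "G", 3: "A"}
f4 = {0: "A", 1: "C", 2: "T"}  # differs from f0 at one position

edges = conflict_graph([f0, f1, f2, f3, f4])
two_groups = mec_score([[f0, f1, f4], [f2, f3]])
one_group = mec_score([[f0, f1, f2, f3, f4]])
```

In this toy input, grouping the fragments into two partitions yields a lower MEC score than placing them all in one, which is the sense in which a lower MEC indicates a better assignment of fragments to chromosome copies. The number of partitions is a free parameter here, mirroring the ploidy-agnostic idea only loosely.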
  • Keywords
    "Genomics","Bioinformatics","Biological cells","DNA"
  • Publisher
    IEEE
  • Conference_Titel
    Bioinformatics and Biomedicine (BIBM), 2015 IEEE International Conference on
  • Type
    conf
  • DOI
    10.1109/BIBM.2015.7359963
  • Filename
    7359963