• DocumentCode
    3602020
  • Title

    Infer Metagenomic Abundance and Reveal Homologous Genomes Based on the Structure of Taxonomy Tree

  • Author

    Yu-Qing Qiu ; Xue Tian ; Shihua Zhang

  • Author_Institution
    Nat. Center for Math. & Interdiscipl. Sci., Acad. of Math. & Syst. Sci., Beijing, China
  • Volume
    12
  • Issue
    5
  • fYear
    2015
  • Firstpage
    1112
  • Lastpage
    1122
  • Abstract
    Metagenomic research uses sequencing technologies to investigate the genetic biodiversity of microbiomes present in various ecosystems or animal tissues. The composition of a microbial community is highly associated with the environment in which the organisms exist. As large amounts of short sequencing reads from microorganism genomes are obtained, accurately estimating the abundance of microorganisms within a metagenomic sample is becoming an increasing challenge in bioinformatics. In this paper, we describe a hierarchical taxonomy tree-based mixture model (HTTMM) for estimating the abundance of taxa within a microbial community by incorporating the structure of the taxonomy tree. In this model, genome-specific short reads and short reads that are homologous among genomes can be distinguished and are represented by leaf and intermediate nodes in the taxonomy tree, respectively. We adopt an expectation-maximization algorithm to solve this model. Using simulated and real-world data, we demonstrate that the proposed method is superior to both the flat mixture model and lowest common ancestry-based methods. Moreover, this model can reveal previously unaddressed homologous genomes.
  • Keywords
    bioinformatics; biological tissues; expectation-maximization algorithm; genetics; genomics; microorganisms; mixture models; animal tissues; ecosystems; genetic biodiversity; hierarchical taxonomy tree-based mixture model; homologous genomes; metagenomic research; microbial community; microbiomes; microorganism genomes; sequencing technologies; databases; taxonomy; vegetation; metagenomics; abundance estimation; taxonomy tree
  • fLanguage
    English
  • Journal_Title
    IEEE/ACM Transactions on Computational Biology and Bioinformatics
  • Publisher
    IEEE
  • ISSN
    1545-5963
  • Type

    jour

  • DOI
    10.1109/TCBB.2015.2415814
  • Filename
    7095552