• DocumentCode
    831214
  • Title

    Efficient Parameterized Algorithms for Biopolymer Structure-Sequence Alignment

  • Author

    Song, Yinglei ; Liu, Chunmei ; Huang, Xiuzhen ; Malmberg, Russell L. ; Xu, Ying ; Cai, Liming

  • Author_Institution
    Dept. of Comput. Sci., Univ. of Georgia, Athens, GA
  • Volume
    3
  • Issue
    4
  • fYear
    2006
  • Firstpage
    423
  • Lastpage
    432
  • Abstract
    Computational alignment of a biopolymer sequence (e.g., an RNA or a protein) to a structure is an effective approach to predict and search for the structure of new sequences. To identify the structure of remote homologs, the structure-sequence alignment has to consider not only sequence similarity, but also spatially conserved conformations caused by residue interactions and, consequently, is computationally intractable. It is difficult to cope with the inefficiency without compromising alignment accuracy, especially for structure search in genomes or large databases. This paper introduces a novel method and a parameterized algorithm for structure-sequence alignment. Both the structure and the sequence are represented as graphs, where, in general, the graph for a biopolymer structure has a naturally small tree width. The algorithm constructs an optimal alignment by finding in the sequence graph the maximum valued subgraph isomorphic to the structure graph. It has the computational time complexity O(k^t N^2) for the structure of N residues and its tree decomposition of width t. Parameter k, small in nature, is determined by a statistical cutoff for the correspondence between the structure and the sequence. This paper demonstrates a successful application of the algorithm to RNA structure search used for noncoding RNA identification. An application to protein threading is also discussed.
  • Keywords
    biology computing; genetics; molecular biophysics; molecular configurations; polymers; proteins; RNA; biopolymer structure-sequence alignment; computational time complexity; genomes; graphs; parameterized algorithms; protein threading; residue interactions; sequence similarity; spatially conserved conformations; tree decomposition; Bioinformatics; Computational biology; Dynamic programming; Genomics; Heuristic algorithms; Linear programming; Proteins; RNA; Sequences; Tree graphs; RNA structure homology search; Structure-sequence alignment; dynamic programming; parameterized algorithm; protein threading; tree decomposition; Algorithms; Base Sequence; Biopolymers; Computer Simulation; Models, Chemical; Models, Molecular; Molecular Sequence Data; Nucleic Acid Conformation; RNA; Sequence Alignment; Sequence Analysis, RNA; Structure-Activity Relationship
  • fLanguage
    English
  • Journal_Title
    Computational Biology and Bioinformatics, IEEE/ACM Transactions on
  • Publisher
    IEEE
  • ISSN
    1545-5963
  • Type

    jour

  • DOI
    10.1109/TCBB.2006.52
  • Filename
    4015383