• DocumentCode
    2530792
  • Title

    Computational Identification of Protein-Coding Sequences by Comparative Analysis

  • Author

    Fontaine, Arnaud ; Touzet, Hélène

  • fYear
    2007
  • fDate
    2-4 Nov. 2007
  • Firstpage
    95
  • Lastpage
    102
  • Abstract
    Gene prediction is an essential step in understanding the genome of a species once it has been sequenced. For that, a promising direction in current research on gene finding is a comparative genomics approach. In this paper, we present a novel approach to identifying evolutionarily conserved protein-coding sequences in genomes. The method takes advantage of the specific substitution pattern of coding sequences together with the consistency of reading frames. It has been implemented in a software called Protea. Large-scale experimentation shows good results. Protea is intended to be a useful complement to existing tools based on homology search or statistical properties of the sequences.
  • Keywords
    Bioinformatics; Biomedical computing; Databases; Genetic mutations; Genomics; Large-scale systems; Organisms; Proteins; RNA; Splicing;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Bioinformatics and Biomedicine, 2007. BIBM 2007. IEEE International Conference on
  • Conference_Location
    Fremont, CA
  • Print_ISBN
    978-0-7695-3031-4
  • Type

    conf

  • DOI
    10.1109/BIBM.2007.11
  • Filename
    4413042