• DocumentCode
    3360227
  • Title

    Development and implementation of CARAS algorithm for automatic annotation, visualization, and GenBank submission of chloroplast genome sequences

  • Author

    Li, Yankai ; Li, Huan ; Zhu, Yingjie ; Li, Zhoujun ; Yin, Chuntao ; Lin, Xiaohan ; Liu, Chang

  • Author_Institution
    Sch. of Comput. Sci. & Eng., Beihang Univ., Beijing, China
  • fYear
    2012
  • fDate
    11-13 Jan. 2012
  • Firstpage
    310
  • Lastpage
    315
  • Abstract
    We present CARAS, a web server that automatically annotates a chloroplast genome sequence and lets users visualize and edit the annotation results interactively and in real time. CARAS accepts a complete chloroplast genome sequence as input. First, it predicts protein-coding sequences and exon-intron structures by combining the results from two types of annotation approaches: ab initio prediction algorithms and similarity-based methods. Second, tRNA genes and inverted repeats are identified using tRNAscan and vmatch. Using 220 chloroplast genome sequences as a test set, we show that CARAS outperforms a similar application, DOGMA, overall. Third, the annotation results are presented as a circular map that includes the name, location, orientation, and length of the various features. A Flex-based module allows users to add, delete, and edit the features online. The results are stored on the server and can be retrieved from a given URL. Users can also download the annotation results in GFF3 format for further analysis in third-party software tools, or in JPEG format for publication. Finally, CARAS can be used to create a Sequin file for GenBank submission of the annotated genome sequence. CARAS is freely available at http://caras.bicoup.com, and it will significantly facilitate research work involving chloroplast genomes.
  • Keywords
    Internet; RNA; biology computing; data visualisation; file servers; genomics; proteins; CARAS algorithm; DOGMA; Flex based module; GFF3 format; GenBank submission; JPEG format; Sequin file; URL; Web server; automatic annotation; automatic visualization; chloroplast genome sequences; exon-intron structures; protein coding sequences; tRNA genes; tRNAscan; third party software tools; vmatch; Bioinformatics; DNA; Genomics; Pipelines; Prediction algorithms; Proteins; Web servers; annotation; chloroplast genome; editing; visualization; web server;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computing, Communications and Applications Conference (ComComAp), 2012
  • Conference_Location
    Hong Kong
  • Print_ISBN
    978-1-4577-1717-8
  • Type

    conf

  • DOI
    10.1109/ComComAp.2012.6154863
  • Filename
    6154863
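
The abstract mentions that annotation results can be downloaded in GFF3 format for use in other tools. As a rough illustration of what such an export involves, the sketch below serializes a few annotated features into GFF3's nine tab-separated columns. It is not CARAS code: the `Feature` class, the `to_gff3` function, the `source` label, the sequence ID, and the example coordinates are all hypothetical and chosen only for demonstration.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    """One annotated feature (hypothetical structure, not taken from CARAS)."""
    seqid: str        # identifier of the genome sequence
    ftype: str        # feature type, e.g. "CDS" or "tRNA"
    start: int        # 1-based, inclusive
    end: int          # 1-based, inclusive
    strand: str       # "+" or "-"
    name: str         # gene name, reused here as the GFF3 ID
    phase: str = "."  # "0", "1", or "2" for CDS features; "." otherwise


def to_gff3(features, source="example"):
    """Serialize features as GFF3 text, sorted by sequence and start position."""
    lines = ["##gff-version 3"]
    for f in sorted(features, key=lambda x: (x.seqid, x.start)):
        if f.start > f.end:
            raise ValueError(f"start > end for feature {f.name}")
        attributes = f"ID={f.name};Name={f.name}"
        columns = [
            f.seqid, source, f.ftype,
            str(f.start), str(f.end),
            ".",        # score column: not available in this sketch
            f.strand, f.phase, attributes,
        ]
        lines.append("\t".join(columns))
    return "\n".join(lines) + "\n"


# Illustrative features with made-up coordinates.
example_features = [
    Feature("cp_genome", "CDS", 500, 1561, "-", "psbA", phase="0"),
    Feature("cp_genome", "tRNA", 10, 83, "-", "trnH"),
]

if __name__ == "__main__":
    print(to_gff3(example_features), end="")
```

Sorting by start position keeps the output predictable for downstream tools. A fuller exporter would also need to percent-escape reserved characters in attribute values and guarantee unique IDs, both of which this sketch skips.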