DocumentCode :
586364
Title :
Annotation guided local similarity search in multiple sequences and its application to mitochondrial genomes
Author :
Moritz, Ruby L. V. ; Bernt, Matthias ; Middendorf, Martin
Author_Institution :
Parallel Comput. & Complex Syst. Group, Univ. Leipzig, Leipzig, Germany
fYear :
2012
fDate :
11-13 Nov. 2012
Firstpage :
157
Lastpage :
162
Abstract :
Given a set of nucleotide sequences and corresponding gene annotations, which might contain a moderate number of errors, we consider the problem of identifying common substrings that occur in homologous genes and of identifying putative errors in the given annotations. The problem is solved by identifying nodes in a suffix tree that contains all substrings occurring in the data set. Because the targeted data set is large, our approach employs a truncated version of suffix trees. The approach is successfully applied to the mitochondrial nucleotide sequences and the corresponding annotations available in RefSeq for more than 2000 metazoan species. We demonstrate that the approach finds appropriate subsequences despite errors in the given annotations. Moreover, it identifies several hundred errors within the RefSeq annotations.
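The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that a flat dictionary of fixed-length substrings can stand in for the nodes at the cut-off depth of a truncated suffix tree, and that a simple majority vote over gene labels is enough to flag a suspect annotation. All names (`build_truncated_index`, `gene_at`, `flag_suspect_annotations`, `max_depth`, `min_support`) are hypothetical.

```python
from collections import Counter, defaultdict


def build_truncated_index(sequences, max_depth):
    """Map each substring of length max_depth to its (seq_id, position) occurrences.

    Assumption: this flat index approximates the nodes at depth max_depth
    of a truncated suffix tree; it is not a real suffix tree.
    """
    index = defaultdict(list)
    for seq_id, seq in sequences.items():
        for pos in range(len(seq) - max_depth + 1):
            index[seq[pos:pos + max_depth]].append((seq_id, pos))
    return index


def gene_at(annotations, seq_id, pos):
    """Return the gene annotated at pos (half-open intervals), or None."""
    for gene, start, end in annotations.get(seq_id, []):
        if start <= pos < end:
            return gene
    return None


def flag_suspect_annotations(sequences, annotations, max_depth, min_support=2):
    """Report occurrences whose gene label disagrees with the majority label.

    Assumption: a shared substring supported by at least min_support
    occurrences with the same label indicates the expected gene.
    """
    suspects = []
    index = build_truncated_index(sequences, max_depth)
    for substring, occurrences in index.items():
        labelled = [(s, p, gene_at(annotations, s, p)) for s, p in occurrences]
        counts = Counter(label for _, _, label in labelled if label is not None)
        if not counts:
            continue
        majority, support = counts.most_common(1)[0]
        if support < min_support:
            continue
        for seq_id, pos, label in labelled:
            if label != majority:
                suspects.append((seq_id, pos, substring, label, majority))
    return suspects
```

Calling `flag_suspect_annotations(sequences, annotations, max_depth=6)` on a dictionary of sequences and a dictionary of `(gene, start, end)` tuples returns the positions where a substring shared by several sequences carries a minority gene label. A real suffix tree would additionally allow shared substrings of varying length to be found in one traversal.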
Keywords :
DNA; bioinformatics; data analysis; genomics; search problems; sequences; RefSeq annotations; annotation guided local similarity search; data analysis methods; gene annotations; homologous genes; mitochondrial genomes; mitochondrial nucleotide sequences; multiple sequences; nucleotide sequences; suffix tree; Encoding; Genomics; Indexes; Memory management; Proteins; Tree data structures; Vegetation;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Bioinformatics & Bioengineering (BIBE), 2012 IEEE 12th International Conference on
Conference_Location :
Larnaca
Print_ISBN :
978-1-4673-4357-2
Type :
conf
DOI :
10.1109/BIBE.2012.6399666
Filename :
6399666