DocumentCode :
2950409
Title :
Benchmarking of gene prediction programs for metagenomic data
Author :
Yok, Non ; Rosen, Gail
Author_Institution :
Electr. & Comput. Eng. Dept., Drexel Univ., Philadelphia, PA, USA
fYear :
2010
fDate :
Aug. 31 2010-Sept. 4 2010
Firstpage :
6190
Lastpage :
6193
Abstract :
This manuscript presents the most rigorous benchmarking of gene annotation algorithms for metagenomic datasets to date. We compare three programs: GeneMark, MetaGeneAnnotator (MGA), and Orphelia. The comparison is based on their performance on simulated fragments from one hundred species of diverse lineages. We define four types of fragments: two come from the inter- and intra-coding regions, and the other two come from the gene edges. Hoff et al. used only 12 species in their comparison, a sample too small to represent an environmental sample. In addition, no previous study has separately examined fragments that contain gene edges as opposed to intra-coding regions. In general, the performance of all three programs improves as fragment length increases. Intra-coding fragments show lower annotation error than gene-edge fragments in all of the programs. Overall, we found an upper-bound performance by combining all the methods.
Keywords :
bioinformatics; genetics; genomics; GeneMark; MetaGeneAnnotator; Orphelia; benchmarking; gene annotation algorithms; gene edge fragments; gene prediction programs; intra-coding fragments; intracoding regions; metagenomic data; Benchmark testing; Bioinformatics; Encoding; Genomics; Hidden Markov models; Measurement uncertainty; Sensitivity; Algorithms; Benchmarking; Databases, Genetic; Metagenomics; Molecular Sequence Annotation; ROC Curve;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Engineering in Medicine and Biology Society (EMBC), 2010 Annual International Conference of the IEEE
Conference_Location :
Buenos Aires
ISSN :
1557-170X
Print_ISBN :
978-1-4244-4123-5
Type :
conf
DOI :
10.1109/IEMBS.2010.5627744
Filename :
5627744