Title :
Indexing and retrieval for genomic databases
Author :
Williams, Hugh E. ; Zobel, Justin
Author_Institution :
Dept. of Comput. Sci., R. Melbourne Inst. of Technol., Vic., Australia
Abstract :
Genomic sequence databases are widely used by molecular biologists for homology searching. Amino acid and nucleotide databases are increasing in size exponentially, and mean sequence lengths are also increasing. In searching such databases, it is desirable to use heuristics to perform computationally intensive local alignments on selected sequences and to reduce the costs of the alignments that are attempted. We present an index-based approach both for selecting sequences that display broad similarity to a query and for fast local alignment. We show experimentally that the indexed approach results in significant savings in computationally intensive local alignments and that index-based searching is as accurate as existing exhaustive search schemes.
Keywords :
data structures; database indexing; information retrieval; scientific information systems; genomic sequence databases; computationally intensive local alignments; genomic databases retrieval; heuristics; homology searching; index-based approach; index-based searching; mean sequence lengths; molecular biologists; scientific databases; Bioinformatics; Databases; Genomics; Indexing; Information retrieval
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on