DocumentCode
2762589
Title
Fine Tuning the Enhanced Suffix Array
Author
Abouelhoda, M.I. ; Dawood, A.
Author_Institution
Fac. of Eng., Cairo Univ., Cairo
fYear
2008
fDate
18-20 Dec. 2008
Firstpage
1
Lastpage
4
Abstract
The enhanced suffix array is an indexing data structure used for a wide range of applications in bioinformatics. It is the suffix array augmented with auxiliary tables that supply additional information and improve performance both in theory and in practice. In this paper, we present a number of improvements to the enhanced suffix array: 1) We show how to find a pattern of length m in O(m) time, i.e., independent of the alphabet size. 2) We present an improved representation of the bucket table. 3) We improve the access time to the LCP (longest common prefix) table when it is represented with one byte per entry. The basic idea behind these improvements is the extensive use of minimal perfect hashing, by which n static items can be stored in linear space while retaining O(1) access time.
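As a rough illustration of improvement 3, the sketch below stores an LCP table in one byte per entry and keeps the values that do not fit in a byte in a separate lookup structure. It is not the authors' implementation. The names `suffix_array`, `lcp_table`, and `CompactLcp` are hypothetical, the construction is deliberately naive, and a Python `dict` stands in for the minimal perfect hash function described in the abstract (a `dict` gives expected O(1) lookups but is neither minimal nor perfect).

```python
from typing import Dict, List


def suffix_array(text: str) -> List[int]:
    """Naive suffix array: sort suffix start positions lexicographically.

    Illustration only; practical constructions are far more efficient.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def lcp_table(text: str, sa: List[int]) -> List[int]:
    """lcp[i] = longest common prefix length of suffixes sa[i-1] and sa[i]."""
    lcp = [0] * len(sa)
    for i in range(1, len(sa)):
        a, b, k = sa[i - 1], sa[i], 0
        while a + k < len(text) and b + k < len(text) and text[a + k] == text[b + k]:
            k += 1
        lcp[i] = k
    return lcp


class CompactLcp:
    """One byte per LCP entry; values >= 255 live in an overflow table."""

    SENTINEL = 255  # marks "look up the real value in the overflow table"

    def __init__(self, lcp: List[int]) -> None:
        # Values below 255 are stored directly; larger ones are clamped.
        self.small = bytearray(min(v, self.SENTINEL) for v in lcp)
        # Assumption: a dict keyed by table index replaces the minimal
        # perfect hash; both avoid searching a sorted overflow list.
        self.overflow: Dict[int, int] = {
            i: v for i, v in enumerate(lcp) if v >= self.SENTINEL
        }

    def __getitem__(self, i: int) -> int:
        v = self.small[i]
        return v if v < self.SENTINEL else self.overflow[i]


# A highly repetitive text forces some LCP values above 254.
text = "a" * 300 + "$"
sa = suffix_array(text)
lcp = lcp_table(text, sa)
compact = CompactLcp(lcp)
```

Indexing `compact[i]` returns the same value as `lcp[i]` for every position, while only the entries with values of 255 or more occupy space outside the byte array.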
Keywords
biology computing; computational complexity; data structures; O(m) time; bioinformatics; enhanced suffix array; indexing data structure; longest common prefix; Bioinformatics; Chemicals; DNA; Data engineering; Data structures; Indexing; Pattern matching; Proteins; Sequences; Table lookup;
fLanguage
English
Publisher
IEEE
Conference_Titel
Biomedical Engineering Conference, 2008. CIBEC 2008. Cairo International
Conference_Location
Cairo
Print_ISBN
978-1-4244-2694-2
Electronic_ISBN
978-1-4244-2695-9
Type
conf
DOI
10.1109/CIBEC.2008.4786047
Filename
4786047