  • DocumentCode
    2762589
  • Title
    Fine Tuning the Enhanced Suffix Array
  • Author
    Abouelhoda, M.I.; Dawood, A.
  • Author_Institution
    Fac. of Eng., Cairo Univ., Cairo
  • fYear
    2008
  • fDate
    18-20 Dec. 2008
  • Firstpage
    1
  • Lastpage
    4
  • Abstract
    The enhanced suffix array is an indexing data structure used for a wide range of applications in bioinformatics. It is essentially the suffix array augmented with additional tables whose extra information improves performance both in theory and in practice. In this paper, we present a number of improvements to the enhanced suffix array: 1) We show how to find a pattern of length m in O(m) time, i.e., independent of the alphabet size. 2) We present an improved representation of the bucket table. 3) We improve the access time of the LCP (longest common prefix) table when each entry is represented with one byte. The basic idea behind these improvements is the extensive use of minimal perfect hashing, by which n static items can be stored in linear space while retaining O(1) access time.
  • Keywords
    biology computing; computational complexity; data structures; O(m) time; bioinformatics; enhanced suffix array; indexing data structure; longest common prefix; Bioinformatics; Chemicals; DNA; Data engineering; Data structures; Indexing; Pattern matching; Proteins; Sequences; Table lookup
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Biomedical Engineering Conference, 2008. CIBEC 2008. Cairo International
  • Conference_Location
    Cairo
  • Print_ISBN
    978-1-4244-2694-2
  • Electronic_ISBN
    978-1-4244-2695-9
  • Type
    conf
  • DOI
    10.1109/CIBEC.2008.4786047
  • Filename
    4786047