• DocumentCode
    627942
  • Title

    Acyclic Identification of Aptamer from Over-Represented Libraries Using Hash Functions

  • Author

    Xiao, Yiou ; Mehrotra, Kishan G. ; Mohan, Chilukuri K. ; Borer, Phillip N. ; Allis, Damian G.

  • Author_Institution
    Dept. of EECS, Syracuse Univ., Syracuse, NY, USA
  • fYear
    2013
  • fDate
    5-7 April 2013
  • Firstpage
    179
  • Lastpage
    179
  • Abstract
    In recent years, with the advent of fast sequencing technology, genomic databases have been growing rapidly. Researchers in the bioinformatics field expect faster and more accurate tools to analyze these very large data sets effectively. In aptamer search, the goal is to find DNA sequences that are over-represented relative to random background libraries on the same chip. Hash functions are widely used in substring comparison, sequence alignment, and clustering tools. We have developed a lightweight tool that uses hash functions to reduce the size of the genomic data and conducts k-neighbor searches on the centroid sequence. This greatly improves search efficiency compared with the existing tool. Furthermore, precomputing the k-neighbor hash values reduces the overhead of searching for mutants. In a dataset of 1 million sequences, the program accurately counted the frequency of the human alpha-thrombin sequence and found the mutant versions of the target sequence in less than 40 seconds, whereas the existing method takes 8280 seconds (2 hours 13 minutes).
  • Keywords
    DNA; bioinformatics; genomics; molecular configurations; organic compounds; DNA sequences; acyclic aptamer identification; aptamer search; bioinformatic field; centroid sequence; clustering tools; fast sequencing technology; genomic data size; genomic database; gigantic data sets; hash functions; human alpha-Thrombin sequence; k-neighbor hash values; k-neighbor searches; light-weighted tool; random background libraries; sequence alignment; Bioinformatics; Biomedical engineering; DNA; Educational institutions; Genomics; Libraries; Sequential analysis; Aptamer; DNA; Hash; Overrepresented library;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Bioengineering Conference (NEBEC), 2013 39th Annual Northeast
  • Conference_Location
    Syracuse, NY
  • ISSN
    2160-7001
  • Print_ISBN
    978-1-4673-4928-4
  • Type

    conf

  • DOI
    10.1109/NEBEC.2013.2
  • Filename
    6574416
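
The abstract describes hashing sequences to shrink the data and then looking up precomputed k-neighbor hash values of a centroid sequence. The record does not give the actual hash function or neighbor definition, so the following is only a minimal sketch under stated assumptions: a 2-bit-per-base integer encoding stands in for the hash, "k-neighbor" is taken to mean Hamming distance at most k (substitutions only), and the names `encode`, `neighbor_hashes`, and `count_target` are hypothetical, not taken from the authors' tool.

```python
from collections import Counter
from itertools import combinations, product

# Assumed encoding: two bits per base (not confirmed by the source).
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def encode(seq):
    """Pack a DNA string into one integer, two bits per base."""
    value = 0
    for base in seq:
        value = (value << 2) | BASE_CODE[base]
    return value


def neighbor_hashes(seq, k):
    """Return the hashes of all sequences within Hamming distance k of seq."""
    n = len(seq)
    center = encode(seq)
    hashes = {center}
    for d in range(1, k + 1):
        # Choose d positions to mutate, then a nonzero shift for each one.
        for positions in combinations(range(n), d):
            for deltas in product((1, 2, 3), repeat=d):
                h = center
                for pos, delta in zip(positions, deltas):
                    shift = 2 * (n - 1 - pos)
                    old = (h >> shift) & 3
                    new = (old + delta) % 4
                    h ^= (old ^ new) << shift  # swap the 2-bit base in place
                hashes.add(h)
    return hashes


def count_target(reads, target, k):
    """Count exact matches and k-neighbor mutants of target among reads."""
    # Keep only reads of the target length made of A/C/G/T, so that
    # equal hashes imply equal sequences.
    usable = (
        r for r in reads
        if len(r) == len(target) and all(b in BASE_CODE for b in r)
    )
    counts = Counter(encode(r) for r in usable)
    exact = counts[encode(target)]
    mutants = sum(counts[h] for h in neighbor_hashes(target, k)) - exact
    return exact, mutants
```

In this sketch the reads are hashed once into a frequency table, and the mutant search becomes a fixed number of table lookups (one per neighbor hash) rather than a comparison against every read. For example, `count_target(["ACGT", "ACGT", "ACGA", "GGGG"], "ACGT", 1)` would report the two exact copies and the single one-substitution mutant.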