• DocumentCode
    1990681
  • Title
    HAMMER Algorithm: Hashing with Arithmetic Modulo-4 for Motif Extraction of Regulatory Elements
  • Author
    Sheng, Huitao ; Mehrotra, Kishan ; Mohan, Chilukuri ; Raina, Ramesh
  • Author_Institution
    Syracuse Univ., Syracuse
  • fYear
    2007
  • fDate
    14-17 Oct. 2007
  • Firstpage
    753
  • Lastpage
    758
  • Abstract
    A new algorithm, HAMMER, discovers cis-elements in the promoter regions of co-regulated genes. We show that HAMMER is faster and more accurate than well-known tools currently used to identify cis-elements. Given input sequences that represent the promoter regions of genes, the algorithm searches for subsequences of a desired length w that occur with relatively high frequency, while accounting for slightly corrupted variants (with up to d substitutions). The w-mers are numerically encoded and stored in a hash table, and their d-neighbors are discovered efficiently using a modulo-4 arithmetic operation. Profile matrices are constructed and evaluated using a high-order Markov model based on background data (from a gene database). HAMMER discovers the most frequently occurring w-mers (permitting corruption in at most d positions). Experimental results show that HAMMER is significantly faster, and discovers more of the motifs present in the test sequences, than two well-known motif-discovery tools (MDScan and AlignACE).
  • Keywords
    biology computing; genetics; AlignACE; HAMMER algorithm; MDScan; cis-elements; genes; high-order Markov model; modulo-4 arithmetic operation; motif extraction; motif-discovery tools; w-mers; Background noise; Biology computing; Computational efficiency; Data mining; Data structures; Digital arithmetic; Frequency; Heuristic algorithms; Organisms; Sequences;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Bioinformatics and Bioengineering, 2007. BIBE 2007. Proceedings of the 7th IEEE International Conference on
  • Conference_Location
    Boston, MA
  • Print_ISBN
    978-1-4244-1509-0
  • Type
    conf
  • DOI
    10.1109/BIBE.2007.4375645
  • Filename
    4375645
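
The abstract describes encoding w-mers numerically, storing them in a hash table, and finding d-neighbors with modulo-4 arithmetic, but this record does not give the details. The following is a minimal sketch of one way such a scheme could work, not the authors' implementation. It assumes a base-4 encoding (A=0, C=1, G=2, T=3) and generates each neighbor by adding 1, 2, or 3 modulo 4 to up to d digits. All function names (encode, decode, neighbors, count_with_neighbors) are hypothetical.

```python
from collections import Counter
from itertools import combinations, product

BASES = "ACGT"                      # assumed digit order: A=0, C=1, G=2, T=3
CODE = {b: i for i, b in enumerate(BASES)}


def encode(wmer):
    """Pack a w-mer into one integer, one base-4 digit per nucleotide."""
    value = 0
    for base in wmer:
        value = value * 4 + CODE[base]
    return value


def decode(value, w):
    """Inverse of encode for a w-mer of known length w."""
    chars = []
    for _ in range(w):
        chars.append(BASES[value % 4])
        value //= 4
    return "".join(reversed(chars))


def neighbors(value, w, d):
    """Yield the codes of all w-mers within d substitutions of value.

    A substitution at a position is a nonzero shift (1, 2 or 3) of that
    base-4 digit, taken modulo 4. The original code is yielded first.
    """
    yield value
    for k in range(1, d + 1):
        for positions in combinations(range(w), k):
            for shifts in product((1, 2, 3), repeat=k):
                result = value
                for pos, shift in zip(positions, shifts):
                    place = 4 ** pos
                    digit = (result // place) % 4
                    new_digit = (digit + shift) % 4
                    result += (new_digit - digit) * place
                yield result


def count_with_neighbors(sequences, w, d):
    """Return (exact, total) hash tables keyed by w-mer code.

    exact[c] counts occurrences of the w-mer c itself; total[c] counts
    occurrences of every observed w-mer within d substitutions of c.
    """
    exact = Counter()
    for seq in sequences:
        for i in range(len(seq) - w + 1):
            window = seq[i:i + w]
            if all(ch in CODE for ch in window):   # skip N or other symbols
                exact[encode(window)] += 1
    total = Counter()
    for code, n in exact.items():
        for nb in neighbors(code, w, d):
            total[nb] += n
    return exact, total
```

In this sketch, calling count_with_neighbors(sequences, w, d) and then total.most_common(10) would list candidate w-mers ranked by their occurrence count with up to d mismatches. The profile-matrix construction and Markov-model scoring mentioned in the abstract are not covered here.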