• DocumentCode
    1865469
  • Title

    Design of Fast Multiple String Searching Based on Improved Prefix Tree

  • Author

    Cheng, Yu ; Zhang, Tao

  • Author_Institution
    Dept. of Biomed. Eng., Tsinghua Univ., Beijing, China
  • fYear
    2010
  • fDate
    9-10 Jan. 2010
  • Firstpage
    111
  • Lastpage
    114
  • Abstract
    Multi-string matching is one of the most important components in data mining task. New applications in many technology fields require high performance string matching algorithms. This paper first presents a new string searching approach based on a data structure called prefix tree. The innovative algorithm eliminates the functional overlap of the table HASH and Prefix Function. Then we make a little improvement on the prefix tree and present a second algorithm that is faster and more space-saving. It is demonstrated analytically that the two algorithms inherit the optimality and are very competitive in practice. On tests of both real life and synthetic data, our algorithms are also efficient and especially effective for various string pattern and large alphabet sets.
  • Keywords
    data mining; data structures; string matching; data mining; data structure; fast multiple string searching design; improved prefix tree function; multistring matching algorithm; synthetic data; table hash function; Algorithm design and analysis; Application software; Automation; Biomedical engineering; Computer science; Data mining; Life testing; Pattern matching; Payloads; Tree data structures; Multi-string matching; prefix tree; string pattern;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Knowledge Discovery and Data Mining, 2010. WKDD '10. Third International Conference on
  • Conference_Location
    Phuket
  • Print_ISBN
    978-1-4244-5397-9
  • Electronic_ISBN
    978-1-4244-5398-6
  • Type

    conf

  • DOI
    10.1109/WKDD.2010.138
  • Filename
    5432708