• DocumentCode
    2718039
  • Title
    Fast string matching with space-efficient word graphs
  • Author
    Yata, Susumu ; Morita, Kazuhiro ; Fuketa, Masao ; Aoe, Jun-Ichi
  • Author_Institution
    Inst. of Technol. & Sci., Univ. of Tokushima, Tokushima
  • fYear
    2008
  • fDate
    16-18 Dec. 2008
  • Firstpage
    79
  • Lastpage
    83
  • Abstract
    String matching is fundamental to many text-processing applications, such as text mining and content filtering systems. This paper describes a fast string matching algorithm that uses a compact pattern matching machine, the directed acyclic word graph (DAWG). A DAWG is traditionally implemented with a 2-dimensional linked list or a matrix. However, both structures have drawbacks: lookup in the linked-list-based DAWG is slow, and the space requirement of the matrix-based DAWG is large. Therefore, this paper proposes a novel DAWG based on a compacted double-array, which overcomes the drawbacks of the traditional implementations. Experimental results show that the novel DAWG is more efficient than the traditional ones.
  • Keywords
    data mining; directed graphs; pattern matching; text analysis; compact pattern matching machine; compacted double-array; content filtering systems; directed acyclic word graph; fast string matching; space-efficient word graphs; text mining; text-processing applications; Application specific integrated circuits; Dictionaries; Electronic mail; Field programmable gate arrays; Filtering; Handheld computers; Matched filters; Pattern matching; Space technology; Text mining;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Innovations in Information Technology, 2008. IIT 2008. International Conference on
  • Conference_Location
    Al Ain
  • Print_ISBN
    978-1-4244-3396-4
  • Electronic_ISBN
    978-1-4244-3397-1
  • Type
    conf
  • DOI
    10.1109/INNOVATIONS.2008.4781726
  • Filename
    4781726