DocumentCode
2718039
Title
Fast string matching with space-efficient word graphs
Author
Yata, Susumu ; Morita, Kazuhiro ; Fuketa, Masao ; Aoe, Jun-Ichi
Author_Institution
Inst. of Technol. & Sci., Univ. of Tokushima, Tokushima
fYear
2008
fDate
16-18 Dec. 2008
Firstpage
79
Lastpage
83
Abstract
String matching is one of the fundamentals in various text-processing applications such as text mining and content filtering systems. This paper describes a fast string matching algorithm using a compact pattern matching machine DAWG. A directed acyclic word graph (DAWG) is traditionally implemented with a 2-dimensional linked list or matrix. However, DAWGs with these structures have drawbacks, the lookup time of the linked list based one is slow and the space requirement of the matrix based one is large. Therefore, this paper proposes a novel DAWG based on a compacted double-array, which overcomes the drawbacks of traditional ones. Experimental results show that the novel DAWG is more efficient than traditional ones.
Keywords
data mining; directed graphs; pattern matching; text analysis; compact pattern matching machine; compacted double-array; content filtering systems; directed acyclic word graph; fast string matching; space-efficient word graphs; text mining; text-processing applications; Application specific integrated circuits; Dictionaries; Electronic mail; Field programmable gate arrays; Filtering; Handheld computers; Matched filters; Pattern matching; Space technology; Text mining;
fLanguage
English
Publisher
ieee
Conference_Titel
Innovations in Information Technology, 2008. IIT 2008. International Conference on
Conference_Location
Al Ain
Print_ISBN
978-1-4244-3396-4
Electronic_ISBN
978-1-4244-3397-1
Type
conf
DOI
10.1109/INNOVATIONS.2008.4781726
Filename
4781726
Link To Document