Title :
On the Subset Matching
Author_Institution :
University of Winnipeg, Canada
Abstract :
The subset matching problem is to find all occurrences of a pattern string p of length m in a text string t of length n, where each pattern and text location is a set of characters drawn from some alphabet. The pattern is said to occur at text position i if the set p[j] is a subset of the set t[i + j - 1], for all j (1 leqslant j leqslant m). This is an extension of the classic string matching and can be used for finding matching subtree patterns. In this paper, we propose a new algorithm and conduct a probabilistic analysis of its performance, which shows that we need only O(n + m cdot n^{0.5} ) time to solve the problem on average.
Keywords :
Algorithm design and analysis; Character generation; Computer science; Data engineering; Image processing; Information retrieval; Pattern analysis; Pattern matching; Performance analysis; Tree data structures; probabilistic analysis.; signature trees; signatures; string matching; subset matching; tree pattern matching;
Conference_Titel :
Web-Age Information Management Workshops, 2006. WAIM '06. Seventh International Conference on
Conference_Location :
Hong Kong, China
Print_ISBN :
0-7695-2705-1
DOI :
10.1109/WAIMW.2006.21