Title :
Pattern Matching with Independent Wildcard Gaps
Author :
Min, Fan ; Wu, Xindong ; Lu, Zhenyu
Author_Institution :
Sch. of Comput. Sci. & Eng., Univ. of Electron. Sci. & Technol. of China, Chengdu, China
Abstract :
Pattern matching is fundamental in applications such as biological sequence analysis and text indexing. A wildcard gap matches any subsequence whose length lies between two user-specified integers, which makes patterns considerably more flexible. However, most existing works require that all gaps in a pattern be the same. In this paper, we define a new pattern matching problem where gaps are independently specified. The objective is to compute the number of all matches. Since this number is exponential with respect to the maximal gap flexibility and the pattern length, counting matches one by one is computationally infeasible. We develop an efficient algorithm for this problem, named pattern matching with independent wildcard gaps (PAIG), and propose two approaches to enhance its performance further. For the final version, the time complexity is O(L l^2 W^2), where L is the sequence length, l is the pattern length, and W is the maximal gap flexibility. The space complexity is O(lW), making PAIG easy to run in a Java Applet. Experimental results validate the efficiency of PAIG and confirm our analysis of its different versions.
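The counting problem described in the abstract can be illustrated with a minimal sketch. This is not the PAIG algorithm, whose details the abstract does not give. It is a plain dynamic program that counts matches without enumerating them one by one. The function name `count_matches`, the representation of a pattern as a string of characters plus a list of `(min_gap, max_gap)` pairs, and the reading of a gap as the number of sequence characters strictly between two consecutive pattern characters are all assumptions made for illustration.

```python
def count_matches(sequence, chars, gaps):
    """Count matches of a pattern with independently specified wildcard gaps.

    Hypothetical helper, not taken from the paper.
    chars: the pattern characters, e.g. "ACA".
    gaps:  one (min_gap, max_gap) pair per adjacent character pair (assumed form);
           a gap is the number of sequence characters skipped between them.
    """
    if not chars or len(gaps) != len(chars) - 1:
        raise ValueError("need one gap per adjacent pair of pattern characters")

    n = len(sequence)
    # ends[k] = number of ways to match the pattern prefix so far
    # with its last character placed at sequence position k.
    ends = [1 if c == chars[0] else 0 for c in sequence]

    for j in range(1, len(chars)):
        min_gap, max_gap = gaps[j - 1]
        nxt = [0] * n
        for k in range(n):
            if sequence[k] != chars[j]:
                continue
            # The previous character sits at position p with
            # min_gap <= k - p - 1 <= max_gap.
            first = max(0, k - 1 - max_gap)
            last = k - 1 - min_gap
            if last >= first:
                nxt[k] = sum(ends[first:last + 1])
        ends = nxt

    return sum(ends)


# Pattern A[0,1]C[0,2]A in "ACACA": positions (0,1,2), (0,1,4), (2,3,4).
print(count_matches("ACACA", "ACA", [(0, 1), (0, 2)]))  # prints 3
```

Each pass over the sequence sums at most `max_gap - min_gap + 1` earlier counts per position, so the work grows with the sequence length, the pattern length, and the gap flexibility rather than with the number of matches.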
Keywords :
computational complexity; pattern matching; Java Applet; biological sequence analysis; independent wildcard gaps; pattern matching problem; space complexity; text indexing; time complexity; Application software; Biology computing; Computer science; DNA; Data mining; Indexing; Java; Sequences; USA Councils; constraint; sequence; wildcard gap
Conference_Title :
Dependable, Autonomic and Secure Computing, 2009. DASC '09. Eighth IEEE International Conference on
Conference_Location :
Chengdu
Print_ISBN :
978-0-7695-3929-4
Electronic_ISBN :
978-1-4244-5421-1
DOI :
10.1109/DASC.2009.65