DocumentCode
3093256
Title
Pattern Matching with Independent Wildcard Gaps
Author
Min, Fan ; Wu, Xindong ; Lu, Zhenyu
Author_Institution
Sch. of Comput. Sci. & Eng., Univ. of Electron. Sci. & Technol. of China, Chengdu, China
fYear
2009
fDate
12-14 Dec. 2009
Firstpage
194
Lastpage
199
Abstract
Pattern matching is fundamental in applications such as biological sequence analysis and text indexing. A wildcard gap matches any subsequence whose length lies between two user-specified integers, which makes patterns considerably more flexible. However, most existing work requires all gaps in a pattern to be identical. In this paper, we define a new pattern matching problem in which gaps are specified independently. The objective is to compute the total number of matches. Since this number is exponential in the maximal gap flexibility and the pattern length, counting matches one by one is computationally infeasible. We develop an efficient algorithm for this problem, named pattern matching with independent wildcard gaps (PAIG), and propose two approaches that further improve its performance. For the final version, the time complexity is $O(L l^2 W^2)$, where $L$ is the sequence length, $l$ is the pattern length, and $W$ is the maximal gap flexibility. The space complexity is $O(lW)$, making PAIG easy to run in a Java Applet. Experimental results validate the efficiency of PAIG and confirm our analysis of its different versions.
Keywords
computational complexity; pattern matching; Java Applet; biological sequence analysis; independent wildcard gaps; pattern matching problem; space complexity; text indexing; time complexity; Application software; Biology computing; Computer science; DNA; Data mining; Indexing; Java; Pattern matching; Sequences; USA Councils; Pattern matching; constraint; sequence; wildcard gap;
fLanguage
English
Publisher
IEEE
Conference_Titel
Dependable, Autonomic and Secure Computing, 2009. DASC '09. Eighth IEEE International Conference on
Conference_Location
Chengdu
Print_ISBN
978-0-7695-3929-4
Electronic_ISBN
978-1-4244-5421-1
Type
conf
DOI
10.1109/DASC.2009.65
Filename
5380321