DocumentCode :
2757460
Title :
Extraction of Ambiguous Sequential Patterns with Least Minimum Generalization from Mismatch Clusters
Author :
Araki, Kotaro ; Tamura, Keiichi ; Kato, Tomoyuki ; Mori, Yasuma ; Kitakami, Hajime
Author_Institution :
Grad. Sch. of Inf. Sci., Hiroshima City Univ., Hiroshima
fYear :
2007
fDate :
16-18 Dec. 2007
Firstpage :
35
Lastpage :
42
Abstract :
An ambiguous query on a sequence database returns a set of similar subsequences, called a mismatch cluster, to the user. The inherent problem is that it is difficult for users to identify the characteristics of very large similar subsequences in a mismatch cluster. To support user comprehension of mismatch clusters, it is important to extract a set of ambiguous sequential patterns with the least minimum generalization in the mismatch cluster. Extracting this pattern set requires an enormous amount of computational time, since generalized patterns with minimum cover for the mismatch cluster have to be discovered among the candidate generalized patterns. This paper proposes an iterative refinement method to extract ambiguous sequential patterns with minimum cover for mismatch clusters selected from a sequence database. It also proposes combining this method with a domain segmentation method to make pattern extraction efficient. Moreover, a prototype implementing the two proposed methods has been applied to three datasets included in PROSITE in order to evaluate their usefulness. The proposed methods showed a high capability to extract ambiguous sequential patterns from mismatch clusters returned by an ambiguous query on the sequence database.
Keywords :
database management systems; iterative methods; pattern classification; pattern clustering; ambiguous query; ambiguous sequential patterns extraction; domain segmentation method; iterative refinement method; least minimum generalization; mismatch clusters; pattern extraction; sequence databases; Internet;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Signal-Image Technologies and Internet-Based System, 2007. SITIS '07. Third International IEEE Conference on
Conference_Location :
Shanghai
Print_ISBN :
978-0-7695-3122-9
Type :
conf
DOI :
10.1109/SITIS.2007.104
Filename :
4618756