Title :
Mining Relevant Sequence Patterns with CP-Based Framework
Author :
Kemmar, Amina ; Ugarte, Willy ; Loudni, Samir ; Charnois, Thierry ; Lebbah, Yahia ; Boizumault, Patrice ; Cremilleux, Bruno
Author_Institution :
GREYC, Univ. of Caen, Caen, France
Abstract :
Sequential pattern mining under various constraints is a challenging data mining task. This paper provides a generic framework based on constraint programming to discover sequence patterns defined by constraints on local patterns (e.g., gap constraints, regular expressions) or by constraints on patterns involving combinations of local patterns, such as relevant subgroups and top-k patterns. This framework enables the user to mine both kinds of patterns declaratively. The solving step exploits the machinery of constraint programming. For complex patterns involving combinations of local patterns, we improve the mining step by using dynamic CSP. Finally, we present two case studies, one in biomedical information extraction and one in stylistic analysis in linguistics.
Keywords :
constraint handling; data mining; CP-based framework; biomedical information extraction; complex patterns; constraint programming; data mining task; declarative pattern mining; dynamic CSP; gap constraint; generic framework; linguistics; local pattern constraints; regular expression constraint; sequence pattern discovery; sequence pattern mining; stylistic analysis; subgroup constraint; top-k pattern constraint; Automata; Biological system modeling; Data mining; Databases; Frequency measurement; Programming; Solids; Constraint programming; Sequential mining; subgroup patterns
Conference_Title :
Tools with Artificial Intelligence (ICTAI), 2014 IEEE 26th International Conference on
Conference_Location :
Limassol
DOI :
10.1109/ICTAI.2014.89