Title :
A Decomposition Approach for Discovering Discriminative Motifs in a Sequence Database
Author :
Lesaint, David ; Mehta, Deepak ; O'Sullivan, Barry ; Vigneron, Vincent
Author_Institution :
LERIA, Univ. d'Angers, Angers, France
Abstract :
Considerable effort has been invested over the years in ad-hoc algorithms for item set and pattern mining. Constraint programming has recently been proposed as a means to tackle item set mining tasks within a general modelling framework. We follow this approach to address the discovery of discriminative n-ary motifs in databases of labeled sequences. We define an n-ary motif as a mapping of n patterns to n class-wide embeddings, and we restrict the interpretation of constraints on a motif to the sequences embedding all patterns. We formulate core constraints that minimize redundancy between motifs and introduce a general constraint optimization framework to compute common and exclusive motifs. We cast the discovery of closed and replication-free motifs in this framework, for which we propose a two-stage approach based on constraint programming. Experimental results on datasets of protein sequences demonstrate the efficiency of the approach.
Keywords :
constraint handling; database management systems; ad hoc algorithms; constraint programming; discriminative motifs; general constraint optimization framework; general modelling framework; item set mining tasks; labeled sequence database; pattern mining; replication-free motifs; Complexity theory; Constraint optimization; Data mining; Itemsets; Solids; Bioinformatics; Constraint Programming; Motifs; Optimization
Conference_Title :
Tools with Artificial Intelligence (ICTAI), 2014 IEEE 26th International Conference on
Conference_Location :
Limassol
DOI :
10.1109/ICTAI.2014.88