• DocumentCode
    188622
  • Title

    A Decomposition Approach for Discovering Discriminative Motifs in a Sequence Database

  • Author

    Lesaint, David ; Mehta, Deepak ; O'Sullivan, Barry ; Vigneron, Vincent

  • Author_Institution
    LERIA, Univ. d'Angers, Angers, France
  • fYear
    2014
  • fDate
    10-12 Nov. 2014
  • Firstpage
    544
  • Lastpage
    551
  • Abstract
    Considerable effort has been invested over the years in ad-hoc algorithms for item set and pattern mining. Constraint programming has recently been proposed as a means to tackle item set mining tasks within a general modelling framework. We follow this approach to address the discovery of discriminative n-ary motifs in databases of labeled sequences. We define an n-ary motif as a mapping of n patterns to n class-wide embeddings and we restrict the interpretation of constraints on a motif to the sequences embedding all patterns. We formulate core constraints that minimize redundancy between motifs and introduce a general constraint optimization framework to compute common and exclusive motifs. We cast the discovery of closed and replication-free motifs in this framework, for which we propose a two-stage approach based on constraint programming. Experimental results on datasets of protein sequences demonstrate the efficiency of the approach.
  • Keywords
    constraint handling; database management systems; ad hoc algorithms; constraint programming; discriminative motifs; general constraint optimization framework; general modelling framework; item set mining tasks; labeled sequence database; pattern mining; replication-free motifs; Complexity theory; Constraint optimization; Data mining; Itemsets; Solids; Bioinformatics; Constraint Programming; Motifs; Optimization;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Tools with Artificial Intelligence (ICTAI), 2014 IEEE 26th International Conference on
  • Conference_Location
    Limassol
  • ISSN
    1082-3409
  • Type

    conf

  • DOI
    10.1109/ICTAI.2014.88
  • Filename
    6984524