Title :
Learning Constrained Edit State Machines
Author :
Boyer, Laurent ; Gandrillon, Olivier ; Habrard, Amaury ; Pellerin, Mathilde ; Sebban, Marc
Author_Institution :
Lab. Hubert Curien, Univ. de Lyon, Lyon, France
Abstract :
Learning the parameters of the edit distance has been increasingly studied during the past few years to improve the assessment of similarities between structured data, such as strings, trees or graphs. Often based on the optimization of the likelihood of pairs of data, the learned models usually take the form of probabilistic state machines, such as pair-Hidden Markov Models (pair-HMM), stochastic transducers, or probabilistic deterministic automata. Although the use of such models has led to significant improvements in edit distance-based classification tasks, a new challenge has emerged: how can background knowledge be integrated during the learning process? This paper addresses that question for (input, output) pairs of strings. We present a generalization of the pair-HMM in the form of a constrained state machine, where a transition between two states is driven by constraints fulfilled on the input string. Experimental results are provided on a molecular biology task: detecting transcription factor binding sites.
Keywords :
deterministic automata; finite state machines; hidden Markov models; learning (artificial intelligence); constrained state machines; edit distance; molecular biology; pair-hidden Markov models; probabilistic deterministic automata; probabilistic state machines; state machine learning; stochastic transducers; Artificial intelligence; Biological system modeling; Context modeling; Costs; Kernel; Learning automata; Machine learning; Stochastic processes; Transducers; Tree graphs;
Conference_Title :
Tools with Artificial Intelligence, 2009. ICTAI '09. 21st International Conference on
Conference_Location :
Newark, NJ
Print_ISBN :
978-1-4244-5619-2
Electronic_ISBN :
1082-3409
DOI :
10.1109/ICTAI.2009.27