Title :
Stochastic motif extraction using a genetic algorithm with the MDL principle
Author :
Konagaya, Akihiko ; Kondou, Hiroyasu
Author_Institution :
NEC Corp., Kanagawa, Japan
Abstract :
A new methodology for extracting stochastic motifs from protein sequences is proposed. Instead of pursuing precise motifs, the authors aim to extract stochastic motifs, which inherently include exceptions and are more suitable for representing important regions. J. Rissanen's (1978) minimum description length (MDL) principle is used as the quantitative criterion to avoid overfitting to the sample sequences. To avoid combinatorial explosion in motif extraction, a genetic algorithm is used, which is a kind of probabilistic search algorithm based on the biological evolution process. The experimental results demonstrate that the MDL principle greatly increases the convergence speed of a genetic algorithm when extracting stochastic motifs.
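The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' encoding or implementation: the pattern language (fixed residues plus an `x` wildcard), the two-part description length, the truncation selection scheme, and all function names (`matches`, `entropy_bits`, `description_length`, `evolve`) are assumptions made for illustration. A motif is scored by the bits needed to describe the pattern plus the bits needed to encode the class labels it fails to explain, so a motif with a few exceptions can still score well, and a genetic algorithm searches for the pattern with the shortest total description.

```python
import math
import random

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids
WILDCARD = "x"                      # assumed wildcard symbol: matches any residue

def matches(pattern, seq):
    """Return True if the pattern occurs at any position in seq."""
    n = len(pattern)
    return any(
        all(p == WILDCARD or p == c for p, c in zip(pattern, seq[i:i + n]))
        for i in range(len(seq) - n + 1)
    )

def entropy_bits(k, n):
    """Bits to encode n binary labels of which k are positive: n * H(k/n)."""
    if n == 0 or k == 0 or k == n:
        return 0.0
    p = k / n
    return -n * (p * math.log2(p) + (1 - p) * math.log2(1 - p))

def description_length(pattern, positives, negatives):
    """Two-part code (an assumption): pattern bits plus residual label bits."""
    fixed = sum(1 for p in pattern if p != WILDCARD)
    # One flag bit per position, plus log2(20) bits per fixed residue.
    model_bits = len(pattern) + fixed * math.log2(len(ALPHABET))
    hit_pos = sum(matches(pattern, s) for s in positives)
    hit_neg = sum(matches(pattern, s) for s in negatives)
    miss_pos = len(positives) - hit_pos
    miss_neg = len(negatives) - hit_neg
    # Labels are encoded separately for matched and unmatched sequences.
    data_bits = (entropy_bits(hit_pos, hit_pos + hit_neg)
                 + entropy_bits(miss_pos, miss_pos + miss_neg))
    return model_bits + data_bits

def evolve(positives, negatives, length=4, pop_size=30, generations=30, seed=0):
    """Search for a low-description-length pattern with a simple GA."""
    rng = random.Random(seed)
    symbols = ALPHABET + WILDCARD

    def score(pattern):
        return description_length(pattern, positives, negatives)

    # Start from the all-wildcard baseline plus random patterns.
    population = [WILDCARD * length] + [
        "".join(rng.choice(symbols) for _ in range(length))
        for _ in range(pop_size - 1)
    ]
    for _ in range(generations):
        population.sort(key=score)
        parents = population[: pop_size // 2]  # the best half survives unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, length)       # one-point crossover
            child = list(a[:cut] + b[cut:])
            pos = rng.randrange(length)          # mutate one position
            child[pos] = rng.choice(symbols)
            children.append("".join(child))
        population = parents + children
    return min(population, key=score)
```

Because the best half of each generation is carried over unchanged, the shortest description length found never gets worse from one generation to the next. With very few sample sequences the all-wildcard pattern can remain the shortest description, which is the sense in which the criterion guards against overfitting.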
Keywords :
convergence; genetic algorithms; pattern recognition; proteins; search problems; stochastic processes; convergence speed; evolution; exceptions; genetic algorithm; important regions; minimum description length; overfitting; probabilistic search algorithm; protein sequences; stochastic motif extraction; Amino acids; Bonding; Convergence; Evolution (biology); Explosions; Genetic algorithms; Laboratories; National electric code; Proteins; Stochastic processes;
Conference_Title :
Proceedings of the Twenty-Sixth Hawaii International Conference on System Sciences, 1993
Print_ISBN :
0-8186-3230-5
DOI :
10.1109/HICSS.1993.270666