DocumentCode :
893846
Title :
Learning Finite-State Transducers: Evolution Versus Heuristic State Merging
Author :
Lucas, Simon M. ; Reynolds, T. Jeff
Author_Institution :
Dept. of Comput. Sci., Essex Univ., Colchester
Volume :
11
Issue :
3
fYear :
2007
fDate :
June 1, 2007
Firstpage :
308
Lastpage :
325
Abstract :
Finite-state transducers (FSTs) are finite-state machines (FSMs) that map strings in a source domain into strings in a target domain. While there are many reports in the literature of evolving FSMs, there has been much less work on evolving FSTs. In particular, the fitness functions required for evolving FSTs are generally different from those used for FSMs. In this paper, three string distance-based fitness functions are evaluated, in order of increasing computational complexity: string equality, Hamming distance, and edit distance. The fitness-distance correlation (FDC) and evolutionary performance of each fitness function are analyzed when used within a random mutation hill-climber (RMHC). Edit distance has the strongest FDC and also provides the best evolutionary performance, in that it is more likely to find the target FST within a given number of fitness function evaluations. Edit distance is also the most expensive to compute, but in most cases this extra computation is more than justified by its performance. The RMHC was compared with the best-known heuristic method for learning FSTs, the onward subsequential transducer inference algorithm (OSTIA). On noise-free data, the RMHC performs best on problems with sparse training sets and small target machines. The RMHC and OSTIA offer similar performance for large target machines and denser data sets. When noise-corrupted data is used for training, the RMHC still performs well, while OSTIA performs poorly given even small amounts of noise. The RMHC is also shown to outperform a genetic algorithm. Hence, for certain classes of FST induction problems, the RMHC presented in this paper offers the best performance of any known algorithm.
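The three string distances named in the abstract can be sketched in Python as follows. This is a minimal illustration, not the paper's implementation: the function names are hypothetical, the abstract does not say how Hamming distance treats outputs of unequal length (here the length difference is simply counted as extra mismatches), and the edit distance shown is the standard unit-cost Levenshtein distance.

```python
def string_equality(a: str, b: str) -> int:
    """Coarsest measure: 0 if the strings are identical, 1 otherwise."""
    return 0 if a == b else 1


def hamming_distance(a: str, b: str) -> int:
    """Count position-wise mismatches.

    Assumption (not from the abstract): when lengths differ, each
    unmatched trailing symbol counts as one mismatch.
    """
    mismatches = sum(x != y for x, y in zip(a, b))
    return mismatches + abs(len(a) - len(b))


def edit_distance(a: str, b: str) -> int:
    """Unit-cost Levenshtein distance, O(len(a) * len(b)) time."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # delete x
                curr[j - 1] + 1,         # insert y
                prev[j - 1] + (x != y),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]
```

For a target output "abcd" and a produced output "bcda", string equality returns 1, this Hamming variant returns 4, and edit distance returns 2, so only edit distance registers that the produced string is a one-symbol shift away from the target.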
Keywords :
computational complexity; evolutionary computation; finite state machines; learning (artificial intelligence); Hamming distance; finite-state transducers; fitness-distance correlation; onward subsequential transducer inference algorithm; random mutation hill-climber; sparse training sets; string equality; Application software; Genetic mutations; Humans; Inference algorithms; Machine learning; Merging; Performance analysis; Transducers; Finite-state transducer (FST); random mutation hill-climber (RMHC); state merging; string distance; string translation
fLanguage :
English
Journal_Title :
Evolutionary Computation, IEEE Transactions on
Publisher :
IEEE
ISSN :
1089-778X
Type :
jour
DOI :
10.1109/TEVC.2006.880329
Filename :
4220679