Title :
LVQ-based shift-tolerant phoneme recognition
Author :
McDermott, Erik ; Katagiri, Shigeru
Author_Institution :
ATR Visual Perception Res. Labs., Kyoto, Japan
fDate :
6/1/1991
Abstract :
A shift-tolerant neural network architecture for phoneme recognition is described. The system is based on the learning vector quantization (LVQ) algorithms recently developed by Kohonen (1986, 1988), which focus on approximating optimal decision lines in a discrimination task. Recognition rates in the 98%-99% range were obtained for LVQ networks aimed at speaker-dependent recognition of phonemes in small but ambiguous Japanese phonemic classes. A recognition rate of 97.7% was achieved by a large LVQ network covering all Japanese consonants. These results are as good as those obtained with the time delay neural network system developed by Waibel et al. (1989), and suggest that LVQ could be the basis for a high-performance speech recognition system.
Keywords :
data compression; encoding; learning systems; neural nets; speech recognition; Japanese consonants; Japanese phonemic classes; LVQ-based shift-tolerant phoneme recognition; discrimination task; learning vector quantization; optimal decision lines; performances; shift-tolerant neural network architecture; Euclidean distance; Probability density function
Journal_Title :
Signal Processing, IEEE Transactions on