Title :
A time-frequency segmental neural network for phoneme recognition
Author :
Basu, Anjan ; Svendsen, Torbjorn
Author_Institution :
Dept. of Telecommun., Norwegian Inst. of Technol., Trondheim, Norway
Abstract :
The authors propose a time-frequency segmental neural network (TFSNN) which classifies phonemes according to the two-dimensional time-frequency distribution of the whole phonetic segment. It uses a network architecture similar to those used for optical character recognition (OCR) to provide local shift invariance along both the time and the frequency axes. The TFSNN can be used in place of a segmental neural network (SNN) in a hybrid hidden Markov model (HMM)/artificial neural network (ANN) system for automatic speech recognition, as it shows significantly better performance than the SNN. The training time for the TFSNN is also shorter, as it employs far fewer connection weights than the SNN.
Keywords :
hidden Markov models; learning (artificial intelligence); neural nets; speech recognition; time-frequency analysis; automatic speech recognition; connection weights; hidden Markov model; local shift invariance; network architecture; performance; phoneme recognition; time-frequency segmental neural network; training times
Conference_Title :
Acoustics, Speech, and Signal Processing, 1993. ICASSP-93., 1993 IEEE International Conference on
Conference_Location :
Minneapolis, MN, USA
Print_ISBN :
0-7803-7402-9
DOI :
10.1109/ICASSP.1993.319167