DocumentCode :
2018275
Title :
A time-frequency segmental neural network for phoneme recognition
Author :
Basu, Anjan ; Svendsen, Torbjorn
Author_Institution :
Dept. of Telecommun., Norwegian Inst. of Technol., Trondheim, Norway
Volume :
1
fYear :
1993
fDate :
27-30 April 1993
Firstpage :
509
Abstract :
The authors propose a time-frequency segmental neural network (TFSNN) which classifies phonemes according to the two-dimensional time-frequency distribution of the whole phonetic segment. It uses a network architecture similar to those used for optical character recognition (OCR) to provide local shift invariance along both the time and the frequency axes. The TFSNN can be used in place of a segmental neural network (SNN) in a hybrid hidden Markov model (HMM)/artificial neural network (ANN) system for automatic speech recognition, as it shows significantly better performance than the SNN. The training time for the TFSNN is also shorter, as it employs far fewer connection weights than the SNN.
Keywords :
hidden Markov models; learning (artificial intelligence); neural nets; speech recognition; time-frequency analysis; automatic speech recognition; connection weights; hidden Markov model; local shift invariance; network architecture; performance; phoneme recognition; time-frequency segmental neural network; training times;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech, and Signal Processing, 1993. ICASSP-93., 1993 IEEE International Conference on
Conference_Location :
Minneapolis, MN, USA
ISSN :
1520-6149
Print_ISBN :
0-7803-7402-9
Type :
conf
DOI :
10.1109/ICASSP.1993.319167
Filename :
319167