DocumentCode
2018275
Title
A time-frequency segmental neural network for phoneme recognition
Author
Basu, Anjan ; Svendsen, Torbjorn
Author_Institution
Dept. of Telecommun., Norwegian Inst. of Technol., Trondheim, Norway
Volume
1
fYear
1993
fDate
27-30 April 1993
Firstpage
509
Abstract
The authors propose a time-frequency segmental neural network (TFSNN) that classifies phonemes according to the two-dimensional time-frequency distribution of the whole phonetic segment. It uses a network architecture similar to those used for optical character recognition (OCR) to provide local shift invariance along both the time and frequency axes. The TFSNN can be used in place of a segmental neural network (SNN) in a hybrid hidden Markov model (HMM)/artificial neural network (ANN) system for automatic speech recognition, as it shows significantly better performance than the SNN. The training time for the TFSNN is also shorter, as it employs very few connection weights compared with the SNN.
Keywords
hidden Markov models; learning (artificial intelligence); neural nets; speech recognition; time-frequency analysis; automatic speech recognition; connection weights; hidden Markov model; local shift invariance; network architecture; performance; phoneme recognition; time-frequency segmental neural network; training times;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech, and Signal Processing, 1993. ICASSP-93., 1993 IEEE International Conference on
Conference_Location
Minneapolis, MN, USA
ISSN
1520-6149
Print_ISBN
0-7803-7402-9
Type
conf
DOI
10.1109/ICASSP.1993.319167
Filename
319167