Regional Information Center for Science and Technology - Frequency-time-shift-invariant time-delay neural networks for robust continuous speech recognition

DocumentCode :

1894335

Title :

Frequency-time-shift-invariant time-delay neural networks for robust continuous speech recognition

Author :

Sawai, Hidefumi

Author_Institution :

ATR Interpreting Telephony Res. Lab., Kyoto, Japan

fYear :

1991

fDate :

14-17 Apr 1991

Firstpage :

Abstract :

The authors propose neural network (NN) architectures for robust speaker-independent, continuous speech recognition. One architecture is the frequency-time-shift-invariant time-delay neural network (FTDNN). Another architecture is based on windowing each layer of the NN with local time-frequency windows. This architecture makes it possible for the NN to capture global features from the upper layers as well as precise local features from the lower layers. Recognition experiments on easily confused phonemes were performed using /b/, /d/, /g/, /m/, /n/, and /N/ (syllabic nasal) phoneme tokens to verify robustness to variations of speech. Performance results for the different architectures are presented.

Keywords :

delays; neural nets; speech recognition; easily confused phonemes; frequency-time-shift-invariant time-delay neural network; global features; local time-frequency windows; lower layers; phoneme tokens; robust continuous speech recognition; upper layers; Ethics; Feature extraction; Laboratories; Neural networks; Performance evaluation; Robustness; Speech recognition; Telephony; Testing; Time frequency analysis;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech, and Signal Processing, 1991. ICASSP-91., 1991 International Conference on

Conference_Location :

Toronto, Ont.

ISSN :

1520-6149

Print_ISBN :

0-7803-0003-3

Type :

conf

DOI :

10.1109/ICASSP.1991.150274

Filename :

150274

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1894335