DocumentCode :
2292596
Title :
Non-linear speech transition visualization
Author :
Reinhard, Klaus ; Niranjan, Mahesan
Author_Institution :
Dept. of Eng., Cambridge Univ., UK
fYear :
1997
fDate :
7-9 Jul 1997
Firstpage :
257
Lastpage :
261
Abstract :
Modelling context effects and segmental transitions in speech recognition systems is very important. Explicitly modelling segmental transitions in an RNN framework would circumvent these problems. We present an interesting application of Principal Curves, an algorithm for extracting a non-linear summary of p-dimensional data, first published by Hastie and Stuetzle (1989). The algorithm can be used to visualize non-linear transient characteristics in speech. We show that between-phone characteristics found within diphones can be used as discriminant information to distinguish ambiguous phones. The technique is explained and illustrated on the examples /bah/, /dah/ and /gah/.
Keywords :
recurrent neural nets; Principal Curves; RNN framework; ambiguous phones; between-phone characteristics; context effects; diphones; nonlinear speech transition visualization; recurrent neural networks; segmental transition; segmental transitions; short term spectral analysis; speech recognition systems; time-sequential process;
fLanguage :
English
Publisher :
IET
Conference_Titel :
Artificial Neural Networks, Fifth International Conference on (Conf. Publ. No. 440)
Conference_Location :
Cambridge
ISSN :
0537-9989
Print_ISBN :
0-85296-690-3
Type :
conf
DOI :
10.1049/cp:19970736
Filename :
607527