Title :
A dynamic cepstrum incorporating time-frequency masking and its application to continuous speech recognition
Author :
Aikawa, Kiyoaki ; Singer, Harald ; Kawahara, Hideki ; Tohkura, Yoh'ichi
Author_Institution :
ATR Auditory & Visual Perception Lab., Soraku-gun, Kyoto, Japan
Abstract :
A dynamic cepstrum parameter that incorporates the time-frequency characteristics of auditory forward masking is proposed. A masking model is derived from the results of psychological experiments. A novel operational method using a lifter array is derived to perform the time-frequency masking. The parameter simulates the effective input spectrum at the front end of the auditory system and can enhance the spectral dynamics. It represents both the instantaneous and the transitional aspects of a spectral time series. Phoneme and continuous speech recognition experiments demonstrated that the dynamic cepstrum outperforms the conventional cepstrum, both individually and in various combinations with other spectral parameters. Phoneme recognition results improved for ten male and ten female speakers. The masking lifter with a Gaussian window performed better than the one with a square window.
Keywords :
array signal processing; physiological models; speech recognition; time-frequency analysis; Gaussian window; auditory forward masking; continuous speech recognition; dynamic cepstrum; lifter array; performance; phoneme recognition; spectral time series; time-frequency masking;
Conference_Title :
Acoustics, Speech, and Signal Processing, 1993. ICASSP-93., 1993 IEEE International Conference on
Conference_Location :
Minneapolis, MN, USA
Print_ISBN :
0-7803-7402-9
DOI :
10.1109/ICASSP.1993.319399