DocumentCode :
3245357
Title :
Mel-cepstrum modulation spectrum (MCMS) features for robust ASR
Author :
Tyagi, Vivek ; McCowan, Iain ; Misra, Hemant ; Bourlard, Hervé
Author_Institution :
Dalle Molle Inst. for Perceptual Artificial Intelligence, Martigny, Switzerland
fYear :
2003
fDate :
30 Nov.-3 Dec. 2003
Firstpage :
399
Lastpage :
404
Abstract :
In this paper, we present new dynamic features derived from the modulation spectrum of the cepstral trajectories of the speech signal. Cepstral trajectories are projected onto a basis of sines and cosines, yielding the cepstral modulation frequency response of the speech signal. We show that the different sine and cosine basis vectors select different modulation frequencies, whereas the frequency responses of the delta and the double delta filters are centered only around 15 Hz. Therefore, projecting cepstral trajectories onto the basis of sines and cosines yields a more complementary and discriminative range of features. In this work, the cepstrum reconstructed from the lower cepstral modulation frequency components is used as the static feature. In experiments, it is shown that, as well as providing an improvement in clean conditions, these new dynamic features yield a significant increase in speech recognition performance in various noise conditions when compared directly to the standard temporal derivative features and C-JRASTA PLP features.
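The projection described in the abstract can be illustrated with a short sketch. It is not the authors' implementation: it simply takes a discrete Fourier transform of each cepstral coefficient's trajectory over a sliding window, so that the real part is the projection onto cosines and the imaginary part is the negated projection onto sines. The function names, the 10-frame window, and the 100 Hz frame rate are assumptions chosen for illustration and are not taken from the record.

```python
import numpy as np


def cepstral_modulation_spectrum(cepstra, win_len=10):
    """Sliding-window DFT of cepstral trajectories (illustrative sketch).

    cepstra : array of shape (T, K), T frames of K cepstral coefficients.
    win_len : number of frames per analysis window (assumed value).

    Returns a complex array of shape (T - win_len + 1, win_len // 2 + 1, K):
    for each window position and modulation-frequency bin, the projection of
    each coefficient's trajectory onto cosines (real part) and sines
    (negated imaginary part).
    """
    cepstra = np.asarray(cepstra, dtype=float)
    if cepstra.ndim != 2:
        raise ValueError("cepstra must have shape (T, K)")
    if cepstra.shape[0] < win_len:
        raise ValueError("need at least win_len frames")
    # Shape (T - win_len + 1, K, win_len): one window per frame position.
    windows = np.lib.stride_tricks.sliding_window_view(cepstra, win_len, axis=0)
    # Real-input DFT along the time axis of each window.
    spectrum = np.fft.rfft(windows, axis=-1)
    # Reorder to (window, modulation bin, cepstral coefficient).
    return np.transpose(spectrum, (0, 2, 1))


def modulation_frequencies(win_len=10, frame_rate_hz=100.0):
    """Centre frequency in Hz of each modulation bin (frame rate is assumed)."""
    return np.fft.rfftfreq(win_len, d=1.0 / frame_rate_hz)
```

Under these assumptions each bin corresponds to a different modulation frequency (0, 10, 20, ... Hz for a 10-frame window at 100 frames per second), which is the sense in which different basis vectors select different modulation frequencies. Keeping only the lowest bins and inverting the transform would give a smoothed cepstrum, in the spirit of the static feature the abstract mentions.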
Keywords :
band-pass filters; cepstral analysis; feature extraction; frequency response; modulation; signal reconstruction; speech intelligibility; speech processing; speech recognition; MCMS features; Mel-cepstrum modulation spectrum; cepstral modulation frequency response; cepstral trajectories; cepstrum reconstruction; cosine basis vectors; delta filters; double delta filters; noise conditions; robust ASR; sine basis vectors; speech intelligibility; speech recognition performance; speech signal; Automatic speech recognition; Band pass filters; Cepstral analysis; Cepstrum; Delta modulation; Frequency modulation; Frequency response; Noise robustness; Nonlinear filters; Speech recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Automatic Speech Recognition and Understanding, 2003. ASRU '03. 2003 IEEE Workshop on
Print_ISBN :
0-7803-7980-2
Type :
conf
DOI :
10.1109/ASRU.2003.1318474
Filename :
1318474