Title :
Integration of Multiple Feature Sets for Reducing Ambiguity in ASR
Author :
Rose, Rachel ; Momayyez, P.
Author_Institution :
Dept. of Electr. & Comput. Eng., McGill Univ., Montreal, Que., Canada
Abstract :
This paper investigates the feasibility of exploiting the invariance properties of articulatory-based acoustic features to reduce ambiguity in automatic speech recognition (ASR) search. A multivalued phonological feature set defined by King and Taylor (S. King and P. Taylor, 2000) is used along with a time delay neural network implementation of phonological feature detectors to produce eight independent phonological feature streams. Hidden Markov models (HMMs) defined over these phonological feature streams are combined, through a lattice re-scoring procedure, with HMMs defined over spectral-energy-based mel frequency cepstrum coefficient (MFCC) acoustic features. The combined system is shown to yield significant improvements in phone recognition accuracy relative to MFCC-based HMMs alone. A study is also performed to analyze the effects of uncertainty in phonological feature detection.
Keywords :
feature extraction; hidden Markov models; neural nets; speaker recognition; speech processing; ASR; acoustic features; feature detectors; lattice re-scoring procedure; mel frequency cepstrum coefficient; multiple feature sets; phone recognition; time delay neural network; Acoustic signal detection; Automatic speech recognition; Cepstrum; Computer vision; Delay effects; Detectors; Lattices; Mel frequency cepstral coefficient; Neural networks; Acoustic Modeling; Phonological Features; Speech Recognition
Conference_Title :
Acoustics, Speech and Signal Processing, 2007. ICASSP 2007. IEEE International Conference on
Conference_Location :
Honolulu, HI
Print_ISBN :
1-4244-0727-3
DOI :
10.1109/ICASSP.2007.366915