DocumentCode :
1708916
Title :
Incorporating formant cues into distributed speech recognition systems
Author :
Norouzian, Atta ; Selouani, Sid-Ahmed ; Tolba, Hesham ; O'Shaughnessy, Douglas
Author_Institution :
INRS-EMT, Univ. of Quebec, Montreal, QC
fYear :
2008
Firstpage :
1159
Lastpage :
1162
Abstract :
The current front-end for distributed speech recognition (DSR) systems provided by the European Telecommunications Standards Institute (ETSI) is based mainly on state-of-the-art MFCC features. The method proposed in this paper aims to improve the performance of the present ETSI DSR-XAFE (XAFE: extended Audio Front-End). To this end, two sets of acoustic features, formant-like features and MFCC features, are integrated under the multi-stream framework to form a feature vector that is more robust to additive noise. It is shown that, for noisy speech, combining cepstral coefficients with the main spectral peaks (also known as formant-like features) under the multi-stream framework leads to a significant improvement in word recognition accuracy relative to that obtained with MFCCs alone.
Keywords :
cepstral analysis; noise; speech recognition; ETSI DSR-XAFE; European Telecommunications Standards Institute; MFCC features; additive noise; cepstral coefficients; distributed speech recognition systems; extended audio front-end; feature vector; formant cues; multistream framework; noisy speech; spectral peaks; word recognition accuracy; Additive noise; Automatic speech recognition; Cepstral analysis; Cepstrum; Data mining; Linear predictive coding; Mel frequency cepstral coefficient; Noise robustness; Speech recognition; Telecommunication standards;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Communications, Control and Signal Processing, 2008. ISCCSP 2008. 3rd International Symposium on
Conference_Location :
St Julians
Print_ISBN :
978-1-4244-1687-5
Electronic_ISBN :
978-1-4244-1688-2
Type :
conf
DOI :
10.1109/ISCCSP.2008.4537400
Filename :
4537400