DocumentCode :
1834218
Title :
The effect of speech and audio compression on speech recognition performance
Author :
Besacier, L. ; Bergamini, C. ; Vaufreydaz, D. ; Castelli, E.
Author_Institution :
Lab. CLIPS-IMAG, Univ. Joseph Fourier, Grenoble, France
fYear :
2001
fDate :
2001
Firstpage :
301
Lastpage :
306
Abstract :
This paper takes an in-depth look at the influence of different speech and audio codecs on the performance of our continuous speech recognition engine. GSM full rate, G711, G723.1 and MPEG coders are investigated. It is shown that MPEG transcoding degrades speech recognition performance at low bitrates, whereas performance remains acceptable for specialized speech coders such as GSM or G711. A new strategy is proposed to cope with the degradation due to low bitrate coding: the acoustic models of the speech recognition system are trained with transcoded speech (one acoustic model for each speech/audio codec). First results show that this strategy recovers acceptable performance.
Keywords :
audio coding; cellular radio; code standards; data compression; speech codecs; speech coding; speech recognition; telecommunication equipment testing; telecommunication standards; G711 coders; G723.1 coders; GSM full rate coders; MPEG coders; MPEG transcoding; acoustic models; audio codec testing; audio codecs; audio compression; continuous speech recognition engine; low bitrate coding; speech compression; speech recognition performance; speech recognition system; Audio compression; Bit rate; Degradation; Engines; GSM; Mobile handsets; Network servers; Speech codecs; Speech coding; Speech recognition
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Multimedia Signal Processing, 2001 IEEE Fourth Workshop on
Conference_Location :
Cannes
Print_ISBN :
0-7803-7025-2
Type :
conf
DOI :
10.1109/MMSP.2001.962750
Filename :
962750