DocumentCode
1834218
Title
The effect of speech and audio compression on speech recognition performance
Author
Besacier, L. ; Bergamini, C. ; Vaufreydaz, D. ; Castelli, E.
Author_Institution
Lab. CLIPS-IMAG, Univ. Joseph Fourier, Grenoble, France
fYear
2001
fDate
2001
Firstpage
301
Lastpage
306
Abstract
This paper examines in depth how different speech and audio codecs affect the performance of our continuous speech recognition engine. GSM full rate, G711, G723.1 and MPEG coders are investigated. MPEG transcoding is shown to degrade speech recognition performance at low bitrates, whereas performance remains acceptable for specialized speech coders such as GSM or G711. A new strategy is proposed to cope with the degradation caused by low-bitrate coding: the acoustic models of the speech recognition system are trained on transcoded speech, with one acoustic model per speech/audio codec. First results show that this strategy recovers acceptable performance.
Keywords
audio coding; cellular radio; code standards; data compression; speech codecs; speech coding; speech recognition; telecommunication equipment testing; telecommunication standards; G711 coders; G723.1 coders; GSM full rate coders; MPEG coders; MPEG transcoding; acoustic models; audio codec testing; audio codecs; audio compression; continuous speech recognition engine; low bitrate coding; speech compression; speech recognition performance; speech recognition system; Audio compression; Bit rate; Degradation; Engines; GSM; Mobile handsets; Network servers; Speech codecs; Speech coding; Speech recognition
fLanguage
English
Publisher
IEEE
Conference_Titel
Multimedia Signal Processing, 2001 IEEE Fourth Workshop on
Conference_Location
Cannes
Print_ISBN
0-7803-7025-2
Type
conf
DOI
10.1109/MMSP.2001.962750
Filename
962750