DocumentCode
1188348
Title
Recognizing GSM digital speech
Author
Gallardo-Antolín, Ascensión ; Peláez-Moreno, Carmen ; Díaz-de-María, Fernando
Author_Institution
Signal Theor. & Commun. Dept., Univ. Carlos III de Madrid, Spain
Volume
13
Issue
6
fYear
2005
Firstpage
1186
Lastpage
1205
Abstract
The Global System for Mobile (GSM) environment encompasses three main problems for automatic speech recognition (ASR) systems: noisy scenarios, source coding distortion, and transmission errors. The first one has already received much attention; however, source coding distortion and transmission errors must be explicitly addressed. In this paper, we propose an alternative front-end for speech recognition over GSM networks. This front-end is specially conceived to be effective against source coding distortion and transmission errors. Specifically, we suggest extracting the recognition feature vectors directly from the encoded speech (i.e., the bitstream) instead of decoding it and subsequently extracting the feature vectors. This approach offers two significant advantages. First, the recognition system is only affected by the quantization distortion of the spectral envelope. Thus, we are avoiding the influence of other sources of distortion as a result of the encoding-decoding process. Second, when transmission errors occur, our front-end becomes more effective since it is not affected by errors in bits allocated to the excitation signal. We have considered the half and the full-rate standard codecs and compared the proposed front-end with the conventional approach in two ASR tasks, namely, speaker-independent isolated digit recognition and speaker-independent continuous speech recognition. In general, our approach outperforms the conventional procedure, for a variety of simulated channel conditions. Furthermore, the disparity increases as the network conditions worsen.
Keywords
cellular radio; code standards; combined source-channel coding; decoding; distortion; feature extraction; quantisation (signal); radio networks; speech codecs; speech coding; speech recognition; GSM network; bit allocation; decoding process; digital speech recognition; error transmission; feature extraction; full-rate standard codec; global system for mobile; quantization distortion; source coding distortion; speech encoding; tandeming; wireless network; Automatic speech recognition; Code standards; Codecs; Decoding; Feature extraction; GSM; Quantization; Source coding; Speech recognition; Working environment noise; Coding distortion; Global System for Mobile (GSM) networks; speech coding; speech recognition; tandeming; transmission errors; wireless networks;
fLanguage
English
Journal_Title
Speech and Audio Processing, IEEE Transactions on
Publisher
ieee
ISSN
1063-6676
Type
jour
DOI
10.1109/TSA.2005.853210
Filename
1518918