DocumentCode :
1502200
Title :
A bitstream-based front-end for wireless speech recognition on IS-136 communications system
Author :
Kim, Hong Kook ; Cox, Richard V.
Author_Institution :
AT&T Labs.-Res., Florham Park, NJ, USA
Volume :
9
Issue :
5
fYear :
2001
fDate :
7/1/2001
Firstpage :
558
Lastpage :
568
Abstract :
We propose a feature extraction method for a speech recognizer that operates in digital communication networks. The feature parameters are basically extracted by converting the quantized spectral information of a speech coder into a cepstrum. We also include the voiced/unvoiced information obtained from the bitstream of the speech coder in the recognition feature set. We performed speaker-independent connected digit HMM recognition experiments under clean, background noise, and channel impairment conditions. From these results, we found that the speech recognition system employing the proposed bitstream-based front-end gives superior word and string accuracies over a recognizer constructed from decoded speech signals. Its performance is comparable to that of a wireline recognition system that uses the cepstrum as a feature set. Next, we extended the evaluation of the proposed bitstream-based front-end to large vocabulary speech recognition with a name database. The recognition results proved that the proposed bitstream-based front-end also gives a comparable performance to the conventional wireline front-end.
Keywords :
cellular radio; digital radio; feature extraction; speech coding; speech recognition; IS-136 communications system; background noise; bitstream-based front-end; cepstrum; channel impairment; clean conditions; decoded speech signals; digital communication networks; feature extraction method; feature parameters; large vocabulary speech recognition; name database; performance; quantized spectral information; speaker-independent connected digit HMM recognition; speech coder; speech recognition system; string accuracy; voiced/unvoiced information; wireless speech recognition; wireline front-end; word accuracy; Background noise; Cepstrum; Data mining; Decoding; Digital communication; Feature extraction; Hidden Markov models; Spatial databases; Speech recognition; Vocabulary
fLanguage :
English
Journal_Title :
Speech and Audio Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1063-6676
Type :
jour
DOI :
10.1109/89.928920
Filename :
928920