DocumentCode :
1275835
Title :
Low Bit-Rate Speech Coding Through Quantization of Mel-Frequency Cepstral Coefficients
Author :
Boucheron, Laura E. ; De Leon, Phillip L. ; Sandoval, Steven
Author_Institution :
Klipsch Sch. of Electr. & Comput. Eng., New Mexico State Univ., Las Cruces, NM, USA
Volume :
20
Issue :
2
fYear :
2012
Firstpage :
610
Lastpage :
619
Abstract :
In this paper, we propose a low bit-rate speech codec based on vector quantization (VQ) of the mel-frequency cepstral coefficients (MFCCs). We begin by showing that if a high-resolution mel-frequency cepstrum (MFC) is computed, good-quality speech reconstruction is possible from the MFCCs despite the lack of phase information. By evaluating the contribution that individual MFCCs make toward speech quality and applying appropriate quantization, we show that the MFCC-based codec exceeds the state-of-the-art MELPe codec across the entire range of 600-2400 bps when evaluated with the perceptual evaluation of speech quality (PESQ) (ITU-T Recommendation P.862). The main advantage of the proposed codec is in distributed speech recognition (DSR), since the MFCCs can be applied directly, thus eliminating additional decode and feature-extraction stages; furthermore, the proposed codec better preserves the fidelity of the MFCCs and achieves better word accuracy rates than the CELP and MELPe codecs.
Keywords :
cepstral analysis; speech coding; speech recognition; vector quantisation; CELP codecs; MELPe codecs; distributed speech recognition; high resolution mel frequency cepstrum; low bit rate speech coding; mel frequency cepstral coefficients; vector quantization; Codecs; Feature extraction; Mel frequency cepstral coefficient; Speech; Speech processing; Speech analysis
fLanguage :
English
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1558-7916
Type :
jour
DOI :
10.1109/TASL.2011.2162407
Filename :
5957263