Title :
Low Bit-Rate Speech Coding Through Quantization of Mel-Frequency Cepstral Coefficients
Author :
Boucheron, Laura E. ; De Leon, Phillip L. ; Sandoval, Steven
Author_Institution :
Klipsch Sch. of Electr. & Comput. Eng., New Mexico State Univ., Las Cruces, NM, USA
Abstract :
In this paper, we propose a low bit-rate speech codec based on vector quantization (VQ) of the mel-frequency cepstral coefficients (MFCCs). We begin by showing that if a high-resolution mel-frequency cepstrum (MFC) is computed, good-quality speech reconstruction is possible from the MFCCs despite the lack of phase information. By evaluating the contribution that individual MFCCs make toward speech quality and applying appropriate quantization, we show that the MFCC-based codec outperforms the state-of-the-art MELPe codec across the entire range of 600-2400 bps when evaluated with the perceptual evaluation of speech quality (PESQ) (ITU-T Recommendation P.862). The main advantage of the proposed codec is in distributed speech recognition (DSR), since the MFCCs can be applied directly, eliminating the additional decoding and feature extraction stages; furthermore, the proposed codec better preserves the fidelity of the MFCCs and yields better word accuracy rates than the CELP and MELPe codecs.
Keywords :
cepstral analysis; speech coding; speech recognition; vector quantisation; CELP codecs; MELPe codecs; distributed speech recognition; high resolution mel frequency cepstrum; low bit rate speech coding; mel frequency cepstral coefficients; vector quantization; Codecs; Feature extraction; Mel frequency cepstral coefficient; Speech; Speech processing; Speech recognition; Speech analysis
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
DOI :
10.1109/TASL.2011.2162407