Title :
A mixed sinusoidally excited linear prediction coder at 4 kb/s and below
Author :
Yeldener, Suat ; De Martin, Juan Carlos ; Viswanathan, V.
Author_Institution :
DSP Solutions R&D Center, Texas Instruments Inc., Dallas, TX, USA
Abstract :
There is currently a great deal of interest in the development of speech coding algorithms capable of delivering toll quality at 4 kb/s and below. For synthesizing high quality speech, accurate representation of the voiced portions of speech is essential. At bit rates of 4 kb/s and below, conventional code excited linear prediction (CELP) is unlikely to provide the appropriate degree of periodicity. It has been shown that good quality low bit rate speech coding can be obtained with frequency domain techniques such as sinusoidal transform coding (STC), multi-band excitation (MBE), mixed excitation linear prediction (MELP), and multi-band LPC (MB-LPC) vocoders. In this paper, a speech coding algorithm based on an improved version of MB-LPC is presented. The main features of this algorithm include a multi-stage time/frequency pitch estimation and an improved mixed voicing representation. An efficient quantization scheme for the spectral amplitudes of the excitation, called formant weighted vector quantization, is also used. This improved coder, called mixed sinusoidally excited linear prediction (MSELP), yields an unquantized model with speech quality better than that of 32 kb/s ADPCM. Initial efforts towards a fully quantized 4 kb/s coder, although not yet successful in achieving the toll quality goal, have produced good output speech quality.
Keywords :
linear predictive coding; parameter estimation; spectral analysis; speech coding; time-frequency analysis; vector quantisation; vocoders; 4 kbit/s; MSELP coder; efficient quantization scheme; formant weighted vector quantization; frequency domain techniques; high quality speech synthesis; low bit rate coding; mixed excitation linear prediction; mixed sinusoidally excited linear prediction coder; mixed voicing representation; multi-band LPC; multi-band excitation; multi-stage time/frequency pitch estimation; sinusoidal transform coding; spectral amplitudes; toll quality; bit rate; frequency domain analysis; frequency estimation; predictive models; speech synthesis; transform coding
Conference_Title :
Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing
Conference_Location :
Seattle, WA
Print_ISBN :
0-7803-4428-6
DOI :
10.1109/ICASSP.1998.675333