A mixed sinusoidally excited linear prediction coder at 4 kb/s and below

Author

Yeldener, Suat ; De Martin, Juan Carlos ; Viswanathan, V.

Author_Institution

DSP Solutions R&D Center, Texas Instruments Inc., Dallas, TX, USA

Volume

2

fYear

1998

fDate

12-15 May 1998

Firstpage

589

Abstract

There is currently a great deal of interest in the development of speech coding algorithms capable of delivering toll quality at 4 kb/s and below. For synthesizing high-quality speech, accurate representation of the voiced portions of speech is essential. At bit rates of 4 kb/s and below, conventional code excited linear prediction (CELP) is unlikely to provide the appropriate degree of periodicity. It has been shown that good-quality low-bit-rate speech coding can be obtained with frequency domain techniques such as sinusoidal transform coding (STC), multi-band excitation (MBE), mixed excitation linear prediction (MELP), and multi-band LPC (MB-LPC) vocoders. In this paper, a speech coding algorithm based on an improved version of MB-LPC is presented. The main features of this algorithm include a multi-stage time/frequency pitch estimation and an improved mixed voicing representation. An efficient quantization scheme for the spectral amplitudes of the excitation, called formant weighted vector quantization, is also used. This improved coder, called mixed sinusoidally excited linear prediction (MSELP), yields an unquantized model with speech quality better than that of 32 kb/s ADPCM. Initial efforts towards a fully quantized 4 kb/s coder, although not yet successful in achieving the toll quality goal, have produced good output speech quality.

Keywords

linear predictive coding; parameter estimation; spectral analysis; speech coding; time-frequency analysis; vector quantisation; vocoders; 4 kbit/s; MSELP coder; efficient quantization scheme; formant weighted vector quantization; frequency domain techniques; high quality speech synthesis; low bit rate coding; mixed excitation linear prediction; mixed sinusoidally excited linear prediction coder; mixed voicing representation; multi-band LPC; multi-band excitation; multi-stage time/frequency pitch estimation; sinusoidal transform coding; spectral amplitudes; toll quality; Bit rate; Frequency domain analysis; Frequency estimation; Predictive models; Speech synthesis; Transform coding; Vector quantization

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech and Signal Processing, 1998. Proceedings of the 1998 IEEE International Conference on

Conference_Location

Seattle, WA

ISSN

1520-6149

Print_ISBN

0-7803-4428-6

Type

conf

DOI

10.1109/ICASSP.1998.675333

Filename

675333