Title :
Time envelope vocoder, a new LP based coding strategy for use at bit rates of 2.4 kb/s and below
Author :
Atkinson, I.A. ; Kondoz, A.M. ; Evans, B.G.
Author_Institution :
Centre for Satellite Eng. Res., Surrey Univ., Guildford, UK
fDate :
2/1/1995
Abstract :
This paper presents a linear prediction (LP) based vocoder employing a novel technique which ensures smooth evolution of the synthetic speech waveform. In this coder, speech waveforms are considered as having a 'time envelope', the shape of which contains important perceptual information. By ensuring that the time envelope of the synthetic speech closely matches that of the original, natural-sounding synthetic speech can be produced. Envelope matching may be achieved using a new, low-complexity analysis-by-synthesis loop at the decoder, which determines the synthetic excitation energy. The advantage over more traditional linear prediction vocoders is that the amplitude time envelope is preserved in addition to the spectral envelope, allowing the rapid amplitude transitions associated with onsets to be retained in the synthetic speech and resulting in a more intelligible output. Simply controlling the overall energy of the synthetic excitation is not sufficient to control the synthetic speech energy accurately. Small changes in the linear prediction or pitch parameters, due to quantization for example, can cause variations in the synthetic speech amplitude, especially from one pitch cycle to the next, resulting in noisy synthetic speech. The inclusion of an analysis-by-synthesis loop at the decoder ensures that the synthetic speech amplitude is independent of variations in the pitch period and LP parameters. This paper presents a complete vocoder scheme using time envelope matching, including details of techniques such as parameter interpolation, excitation pulse shaping and pitch tracking, which have proven necessary to produce natural-sounding synthetic speech at 2.4 kb/s and below.
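The decoder-side envelope matching described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract gives no implementation details, so everything below is an assumption made for illustration, including the function names (lp_synthesize, match_envelope_gain, decode_block), the use of a block RMS value as the transmitted envelope target, the sign convention A(z) = 1 + sum a[k] z^-k, and the closed-form gain solution. The idea shown is only that the excitation gain is chosen at the decoder so that the energy of the synthesized block, including the contribution of the filter memory, meets the target, whatever LP coefficients or excitation shape are in use.

```python
import math


def lp_synthesize(excitation, lp_coeffs, state):
    """All-pole synthesis 1/A(z), with A(z) = 1 + sum_k a[k] z^-k (assumed convention).

    `state` holds previous output samples, most recent first.
    Returns (output_block, updated_state).
    """
    order = len(lp_coeffs)
    state = list(state)
    out = []
    for e in excitation:
        s = e - sum(a * past for a, past in zip(lp_coeffs, state))
        out.append(s)
        state = ([s] + state)[:order]
    return out, state


def match_envelope_gain(excitation, lp_coeffs, state, target_rms):
    """Pick the excitation gain so the synthesized block has RMS `target_rms`.

    The filter is linear, so output = zir + gain * zsr, where zir is the
    response to the filter memory alone and zsr the response to the
    unscaled excitation from a zero state. Matching the block energy then
    reduces to a quadratic in the gain.
    """
    n = len(excitation)
    zir, _ = lp_synthesize([0.0] * n, lp_coeffs, state)
    zsr, _ = lp_synthesize(excitation, lp_coeffs, [0.0] * len(lp_coeffs))
    szz = sum(z * z for z in zsr)
    szr = sum(z * r for z, r in zip(zsr, zir))
    srr = sum(r * r for r in zir)
    target_energy = n * target_rms ** 2
    if szz == 0.0:
        return 0.0  # silent excitation: no gain can change the output
    disc = szr * szr - szz * (srr - target_energy)
    if disc < 0.0:
        # Target energy is unreachable; use the gain that minimizes
        # the output energy, which is the closest achievable value.
        return -szr / szz
    return (-szr + math.sqrt(disc)) / szz


def decode_block(excitation, lp_coeffs, state, target_rms):
    """Scale the excitation by the matched gain, then synthesize the block."""
    gain = match_envelope_gain(excitation, lp_coeffs, state, target_rms)
    scaled = [gain * e for e in excitation]
    return lp_synthesize(scaled, lp_coeffs, state)


def rms(block):
    return math.sqrt(sum(x * x for x in block) / len(block))
```

Under these assumptions, calling decode_block with the same target_rms but different lp_coeffs yields blocks with the same RMS whenever the target is reachable, which is the property the abstract attributes to the decoder loop. A practical coder would presumably apply this per pitch cycle or subframe and carry the returned state into the next block.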
Keywords :
interpolation; linear predictive coding; speech coding; speech intelligibility; speech processing; speech synthesis; vocoders; 2.4 kbit/s; LP parameters; amplitude time envelope; analysis by synthesis loop; bit rates; excitation pulse shaping; half rate source coding; linear prediction vocoders; linear predictive coding; natural sounding synthetic speech; noisy synthetic speech; parameter interpolation; pitch cycle; pitch parameters; pitch tracking; quantization; spectral envelope; speech intelligibility; synthetic excitation energy; synthetic speech waveform; time envelope matching; time envelope vocoder; Acoustic noise; Decoding; Interpolation; Noise level; Pulse shaping methods; Quantization; Shape; Speech analysis; Speech synthesis; Vocoders
Journal_Title :
Selected Areas in Communications, IEEE Journal on