DocumentCode :
3054705
Title :
A new model of LPC excitation for producing natural-sounding speech at low bit rates
Author :
Atal, Bishnu S. ; Remde, Joel R.
Author_Institution :
Bell Laboratories, Murray Hill, NJ, USA
Volume :
7
fYear :
1982
fDate :
May 1982
Firstpage :
614
Lastpage :
617
Abstract :
The excitation for LPC speech synthesis usually consists of two separate signals: a delta-function pulse once every pitch period for voiced speech, and white noise for unvoiced speech. This manner of representing excitation requires that speech segments be classified accurately into voiced and unvoiced categories and that the pitch period of voiced segments be known. It is now well recognized that such a rigid idealization of the vocal excitation is often responsible for the unnatural quality associated with synthesized speech. This paper describes a new approach to the excitation problem that does not require a priori knowledge of either the voiced-unvoiced decision or the pitch period. All classes of sounds are generated by exciting the LPC filter with a sequence of pulses; the amplitudes and locations of the pulses are determined using a non-iterative analysis-by-synthesis procedure. This procedure minimizes a perceptual-distance metric representing subjectively important differences between the waveforms of the original and the synthetic speech signals. The distance metric takes account of the finite frequency resolution of the human ear, as well as its differential sensitivity to errors in the formant and inter-formant regions of the speech spectrum.
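The pulse-placement idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a greedy search that places one pulse at a time, and it assumes the perceptual weighting is folded into a bandwidth-expanded synthesis filter 1/A(z/gamma); the abstract specifies neither detail. The names `impulse_response`, `multipulse_search`, `gamma`, and `num_pulses` are hypothetical, and `target` is taken to be a frame of speech that has already been perceptually weighted.

```python
import numpy as np


def impulse_response(a, gamma, length):
    """Impulse response of 1/A(z/gamma), with A(z) = 1 + sum_k a[k-1] z^-k.

    gamma < 1 widens the formant bandwidths (assumed weighting, see lead-in).
    """
    a = np.asarray(a, dtype=float)
    aw = a * gamma ** np.arange(1, len(a) + 1)
    h = np.zeros(length)
    for n in range(length):
        acc = 1.0 if n == 0 else 0.0
        for k in range(1, min(n, len(aw)) + 1):
            acc -= aw[k - 1] * h[n - k]
        h[n] = acc
    return h


def multipulse_search(target, h, num_pulses):
    """Greedy analysis-by-synthesis: add one pulse per step so that the
    squared error between `target` and the synthesized frame drops the most."""
    target = np.asarray(target, dtype=float)
    n = len(target)
    # Column j holds h delayed by j samples and truncated to the frame.
    basis = np.zeros((n, n))
    for j in range(n):
        basis[j:, j] = h[: n - j]
    energy = np.sum(basis ** 2, axis=0)  # >= 1 because h[0] == 1
    residual = target.copy()
    excitation = np.zeros(n)
    for _ in range(num_pulses):
        corr = basis.T @ residual
        # Best location maximizes the error reduction corr^2 / energy.
        loc = int(np.argmax(corr ** 2 / energy))
        amp = corr[loc] / energy[loc]  # least-squares amplitude at loc
        excitation[loc] += amp
        residual -= amp * basis[:, loc]
    return excitation, residual


if __name__ == "__main__":
    # Illustrative second-order predictor, not taken from the paper.
    h = impulse_response([-0.9, 0.2], gamma=0.8, length=40)
    rng = np.random.default_rng(0)
    frame = rng.standard_normal(40)
    pulses, err = multipulse_search(frame, h, num_pulses=4)
    print(np.flatnonzero(pulses), float(np.sum(err ** 2)))
```

Each step removes the best single-pulse contribution from the remaining error, so the error energy never increases as pulses are added. No voiced-unvoiced decision or pitch estimate enters the search.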
Keywords :
Bit rate; Filters; Humans; Linear predictive coding; Pulse generation; Signal resolution; Speech enhancement; Speech recognition; Speech synthesis; White noise;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '82)
Type :
conf
DOI :
10.1109/ICASSP.1982.1171649
Filename :
1171649