DocumentCode
921775
Title
On 450-600 b/s natural sounding speech coding
Author
Cheng, Yan-Ming ; O'Shaughnessy, Douglas
Author_Institution
INRS Telecommun., Verdun, Que., Canada
Volume
1
Issue
2
fYear
1993
fDate
4/1/1993 12:00:00 AM
Firstpage
207
Lastpage
220
Abstract
Algorithms for encoding speech with good intelligibility and naturalness at very low rates are studied. Naturalness is retained by accurately encoding the speech excitation information from an LPC (linear predictive coding) model. A glottal ARX (autoregressive with exogenous input) technique is used to model the speech signal for high quality. A large reduction in coding rate is achieved through short-term temporal compression of the speech and vector quantization. Application of traditional vector quantization to the temporal decomposition output is discussed, with consideration of distortion measures and codebook generation. Based on properties of short-term temporal decomposition, finite-state vector quantization is introduced to further decrease the coding rate. A problem associated with this technique, estimation of a state transition matrix with incomplete data, is treated. The general result is that practical coders operating in a range of 450-600 b/s with a delay of about 200 ms and natural-sounding output speech can be designed.
Keywords
linear predictive coding; speech coding; vector quantisation; 450 to 600 bit/s; LPC model; autoregressive with exogenous input; codebook generation; coding rate; distortion measures; finite-state vector quantization; glottal ARX technique; natural-sounding output speech; short-term temporal compression; speech compression; state transition matrix; vector quantization; very low bit rate; Bit rate; Distortion measurement; Encoding; Speech analysis; Speech synthesis; Testing; Time measurement
fLanguage
English
Journal_Title
Speech and Audio Processing, IEEE Transactions on
Publisher
IEEE
ISSN
1063-6676
Type
jour
DOI
10.1109/89.222879
Filename
222879