Title :
PDF optimized parametric vector quantization of speech line spectral frequencies
Author :
Subramaniam, Anand D. ; Rao, Bhaskar D.
Author_Institution :
Dept. of Electr. & Comput. Eng., Univ. of California, San Diego, CA, USA
fDate :
3/1/2003
Abstract :
A computationally efficient, high-quality vector quantization scheme based on a parametric probability density function (PDF) is proposed. In this scheme, the observations are modeled as i.i.d. realizations of a multivariate Gaussian mixture density. The mixture model parameters are efficiently estimated using the expectation-maximization (EM) algorithm. A low-complexity quantization scheme using transform coding and bit allocation techniques, which allows for easy mapping from observation to quantized value, is developed for both fixed-rate and variable-rate systems. An attractive feature of this method is that source encoding using the resultant codebook involves very few searches, and its computational complexity is minimal and independent of the rate of the system. Furthermore, the proposed scheme is bit scalable and can switch seamlessly between a memoryless quantizer and a quantizer with memory. The usefulness of the approach is demonstrated for speech coding, where Gaussian mixture models are used to model speech line spectral frequencies. The performance of the memoryless quantizer is 1-3 bits better than that of conventional quantization schemes.
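The bit allocation step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the authors' exact procedure, so the code below assumes the textbook high-resolution rule for transform coding: each decorrelated component receives the average bit budget plus half the base-2 log of its variance relative to the geometric mean of all variances. The function name `allocate_bits` and the example variances are hypothetical, chosen only for illustration.

```python
import math

def allocate_bits(variances, total_bits):
    """Split total_bits across decorrelated components of one mixture cluster.

    Assumption: standard high-resolution rule, not taken from the paper.
    The result is real-valued and can go negative for very small variances;
    a practical coder would round and clip, which this sketch omits.
    """
    d = len(variances)
    log_vars = [math.log2(v) for v in variances]
    mean_log = sum(log_vars) / d  # log2 of the geometric mean of the variances
    return [total_bits / d + 0.5 * (lv - mean_log) for lv in log_vars]

# Hypothetical variances of four transform coefficients for one Gaussian component.
variances = [4.0, 1.0, 0.25, 0.0625]
bits = allocate_bits(variances, total_bits=8)
print(bits)  # [3.5, 2.5, 1.5, 0.5]
```

Components with larger variance receive more bits, and the allocations always sum to the total budget, which is what lets such a scheme change its rate without retraining a codebook.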
Keywords :
Gaussian processes; computational complexity; optimisation; probability; source coding; spectral analysis; speech coding; transform coding; vector quantisation; EM algorithm; Gaussian mixture models; PDF optimized parametric VQ; PDF optimized parametric vector quantization; bit allocation; bit scalable scheme; codebook; computational complexity; computationally efficient VQ; expectation maximization algorithm; fixed rate systems; high quality VQ; i.i.d. observations; low complexity quantization; memory quantizer; memoryless quantizer; mixture model parameters; multivariate Gaussian mixture density; probability density function; source encoding; speech coding; speech line spectral frequencies; transform coding; variable rate systems; Bit rate; Computational complexity; Encoding; Frequency; Parameter estimation; Probability density function; Speech coding; Switches; Transform coding; Vector quantization;
Journal_Title :
Speech and Audio Processing, IEEE Transactions on
DOI :
10.1109/TSA.2003.809192