Title :
A High-Rate Optimal Transform Coder With Gaussian Mixture Companders
Author :
Duni, Ethan R. ; Rao, Bhaskar D.
Author_Institution :
Dept. of Electr. & Comput. Eng., California Univ., San Diego, La Jolla, CA
fDate :
3/1/2007
Abstract :
This paper examines the problem of designing fixed-rate transform coders for sources whose distributions are unknown and presumably non-Gaussian, under input-weighted squared error distortion measures. As a component of this system, a flexible scalar compander based on Gaussian mixtures is proposed. The high-rate analysis of transform coders is reviewed, and extended to the case of input-weighted squared error. An algorithm is developed to set the parameters of the system using a data-driven technique that automatically balances the source statistics, distortion measure, and structure of the transform coder to minimize the high-rate distortion. The implementation of Gaussian mixture companders is explored, resulting in a flexible, low-complexity scalar quantizer. Additionally, modifications to the system for operation at moderate rates, using unstructured scalar quantizers, are presented. The operation of the system for the problem of wideband speech line spectral frequencies (LSF) quantization with log spectral distortion is illustrated, and shown to provide good performance with very low complexity.
Keywords :
Gaussian processes; compandors; speech coding; transform coding; Gaussian mixture companders; fixed-rate transform coders; flexible low-complexity scalar quantizer; flexible scalar compander; high-rate optimal transform coder; input-weighted squared error distortion measures; log spectral distortion; source statistics; wideband speech line spectral frequencies quantization; Distortion measurement; Frequency; Karhunen-Loeve transforms; Nonlinear distortion; Parameter estimation; Quantization; Speech; Statistics; Transform coding; Wideband; Compander; high-rate quantization; log spectral distortion (LSD); parameter estimation; transform coder; wideband speech
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
DOI :
10.1109/TASL.2006.885905