Author :
Korten, Pim ; Jensen, Jesper ; Heusdens, Richard
Abstract :
Sinusoidal coding is an often employed technique in low bit-rate audio coding. Therefore, methods for efficient quantization of sinusoidal parameters are of great importance. In this paper, we use high-resolution assumptions to derive analytical expressions for the optimal entropy-constrained unrestricted spherical quantizers for the amplitude, phase, and frequency parameters of the sinusoidal model. This is done both for the case of a single sinusoid and for the more practically relevant case of multiple sinusoids distributed across multiple segments. To account for psychoacoustical effects of the auditory system, a perceptual distortion measure is used. The optimal quantizers minimize a high-resolution approximation of the expected perceptual distortion, while the corresponding quantization indices satisfy an entropy constraint. The quantizers turn out to be flexible and of low complexity, in the sense that they can be determined easily for varying bit rate requirements, without any retraining or iterative procedures. In an objective comparison, it is shown that for the squared error distortion measure, the rate-distortion performance of the proposed method is very close to that of the theoretically optimal entropy-constrained vector quantization. Furthermore, for the perceptual distortion measure, the proposed scheme is shown to objectively outperform an existing sinusoidal quantization scheme in which frequency quantization is done independently. Finally, a subjective listening test, in which the proposed scheme is compared to an existing state-of-the-art sinusoidal quantization scheme with fixed quantizers for all input signals, indicates that the proposed scheme leads to an average bit rate reduction of 20% at the same subjective quality level as the existing scheme.
Keywords :
audio coding; quantisation (signal); auditory system; frequency quantization; high-resolution approximation; high-resolution spherical quantization; low bit-rate audio coding; optimal entropy-constrained unrestricted spherical quantizer; optimal entropy-constrained vector quantization; perceptual distortion measure; psychoacoustical effects; quantization indices; rate-distortion performance; sinusoidal coding; sinusoidal parameters; sinusoidal quantization scheme; squared error distortion measure; subjective listening test; Bit rate; Distortion measurement; Entropy; Frequency; Psychoacoustic models; Psychology; Quantization; Rate-distortion; High-resolution quantization; point density functions; unrestricted spherical quantization
Journal_Title :
IEEE Transactions on Audio, Speech, and Language Processing