Title :
A psychoacoustic-based analysis-by-synthesis scheme for jointly encoding multiple audio objects into independent mixtures
Author :
Xiguang Zheng ; Christian Ritz ; Jiangtao Xi
Author_Institution :
Sch. of Electr. Comput. & Telecommun. Eng., Univ. of Wollongong, Wollongong, NSW, Australia
Abstract :
Perceptually accurate representation of audio objects obtained from multi-track audio signals is desired for applications such as interactive soundfield rendering and browsing. Presented in this work is a scalable psychoacoustic analysis-by-synthesis approach to extract the perceptually dominant time-frequency audio objects from a multi-track audio signal. The proposed compression framework exploits sparsity in the perceptual time-frequency domain where up to eight audio objects can be efficiently encoded using only two audio mixtures with side information representing the origin of the time-frequency instances in the mixture signals. The proposed approach, judged by both objective and subjective tests, results in superior audio quality compared to existing techniques when encoding more than 5 audio objects.
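The idea of packing several sparse time-frequency objects into two mixtures, with side information recording which object each mixture coefficient came from, can be illustrated with a minimal sketch. This is not the authors' method: it assumes plain magnitude ranking in place of the psychoacoustic analysis-by-synthesis selection described in the abstract, treats each object as a flat list of real-valued coefficients, and uses hypothetical function names (`encode`, `decode`).

```python
def encode(objects, n_mix=2):
    """Pack K objects into n_mix mixtures plus per-bin origin indices.

    objects: list of K lists, one coefficient per time-frequency bin.
    Assumption: the n_mix largest-magnitude objects in each bin are kept;
    the paper uses a psychoacoustic criterion instead.
    """
    n_bins = len(objects[0])
    mixtures = [[0.0] * n_bins for _ in range(n_mix)]
    origin = [[-1] * n_bins for _ in range(n_mix)]  # side information
    for b in range(n_bins):
        # Rank object indices by coefficient magnitude in this bin.
        ranked = sorted(range(len(objects)),
                        key=lambda k: abs(objects[k][b]),
                        reverse=True)
        for m, k in enumerate(ranked[:n_mix]):
            mixtures[m][b] = objects[k][b]
            origin[m][b] = k
    return mixtures, origin


def decode(mixtures, origin, n_objects):
    """Route each mixture coefficient back to the object named in origin."""
    n_bins = len(mixtures[0])
    objects = [[0.0] * n_bins for _ in range(n_objects)]
    for m, mix in enumerate(mixtures):
        for b, value in enumerate(mix):
            k = origin[m][b]
            if k >= 0:
                objects[k][b] = value
    return objects


# Three toy objects with at most two active objects per bin.
objs = [[1.0, 0.0, 0.2],
        [0.0, 3.0, 0.0],
        [0.5, 0.1, 0.0]]
mixes, side_info = encode(objs)
recovered = decode(mixes, side_info, len(objs))
```

When no more than two objects are active in any bin, as in this toy input, the round trip is lossless; when more are active, the weaker coefficients are discarded, which is where a perceptual selection rule would matter.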
Keywords :
acoustic field; audio coding; compressed sensing; signal representation; signal synthesis; audio object representation; compression framework; independent mixture; interactive soundfield browsing; interactive soundfield rendering; multiple audio object encoding; multitrack audio signals; psychoacoustic-based analysis-by-synthesis scheme; superior audio quality; time-frequency audio objects; Encoding; Indexes; Psychoacoustics; Source separation; Speech; Time-frequency analysis; Audio object coding; Multichannel audio compression; Soundfield navigation;
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on
Conference_Location :
Vancouver, BC
DOI :
10.1109/ICASSP.2013.6637653