DocumentCode
846074
Title
Binaural cue coding-Part I: psychoacoustic fundamentals and design principles
Author
Baumgarte, Frank ; Faller, Christof
Author_Institution
Media Signal Process. Res. Dept., Agere Syst., Allentown, PA, USA
Volume
11
Issue
6
fYear
2003
Firstpage
509
Lastpage
519
Abstract
Binaural Cue Coding (BCC) is a method for multichannel spatial rendering based on one down-mixed audio channel and BCC side information. The BCC side information has a low data rate and it is derived from the multichannel encoder input signal. A natural application of BCC is multichannel audio data rate reduction since only a single down-mixed audio channel needs to be transmitted. An alternative BCC scheme for efficient joint transmission of independent source signals supports flexible spatial rendering at the decoder. This paper (Part I) discusses the most relevant binaural perception phenomena exploited by BCC. Based on that, it presents a psychoacoustically motivated approach for designing a BCC analyzer and synthesizer. This leads to a reference implementation for analysis and synthesis of stereophonic audio signals based on a Cochlear Filter Bank. BCC synthesizer implementations based on the FFT are presented as low-complexity alternatives. A subjective audio quality assessment of these implementations shows the robust performance of BCC for critical speech and audio material. Moreover, the results suggest that the performance given by the reference synthesizer is not significantly compromised when using a low-complexity FFT-based synthesizer. The companion paper (Part II) generalizes BCC analysis and synthesis for multichannel audio and proposes complete BCC schemes including quantization and coding. Part II also describes an alternative BCC scheme with flexible rendering capability at the decoder and proposes several applications for both BCC schemes.
Keywords
audio coding; channel bank filters; fast Fourier transforms; hearing; speech codecs; FFT-based synthesizer; audio channel; auditory filter bank; auditory scene synthesis; binaural cue coding; binaural perception phenomena; binaural source localization; cochlear filter bank; fast Fourier transform; flexible spatial rendering; multichannel encoder; multichannel spatial rendering; psychoacoustic design; psychoacoustic fundamentals; quantization; stereophonic audio signal; Decoding; Filter bank; Psychology; Quality assessment; Quantization; Robustness; Signal analysis; Signal synthesis; Speech synthesis; Synthesizers
fLanguage
English
Journal_Title
Speech and Audio Processing, IEEE Transactions on
Publisher
IEEE
ISSN
1063-6676
Type
jour
DOI
10.1109/TSA.2003.818109
Filename
1255439