DocumentCode
3500720
Title
Spatial speech coding for multi-teleconferencing
Author
Phua, Kok Soon ; Gan, Woon Seng
Author_Institution
Sch. of Electr. & Electron. Eng., Nanyang Technol. Univ., Singapore
Volume
1
fYear
1999
fDate
1999
Firstpage
313
Abstract
This paper describes a structural model for implementing multichannel speech coding for teleconferencing with spatial audio reproduction. Multiple monaural speech sources are synthesized into binaural sound to produce a more realistic videoconferencing environment. The activity information of each binaural speech signal, as determined by a voice activity detection algorithm, is used to calculate two weighting factors prior to mixing. A third level of weight adjustment can then be carried out by adjusting these weighting factors before they are applied to the individual voice sources. A scheme to remove undesirable noise spikes is also introduced. The two channels are then coded individually using the G.723.1 speech codec.
Keywords
sound reproduction; speech codecs; speech coding; teleconferencing; G.723.1 speech codec; binaural sound; cocktail party effect; multichannel speech coding; multiple monaural speech sources; noise spikes removal; spatial audio reproduction; structural model; teleconferencing; videoconferencing environment; voice activity detection algorithm; weighting factors; Detection algorithms; Electronic mail; Gallium nitride; Pulse modulation; Speech codecs; Speech coding; Speech processing; Speech synthesis; Telecommunication standards; Teleconferencing
fLanguage
English
Publisher
IEEE
Conference_Titel
TENCON 99. Proceedings of the IEEE Region 10 Conference
Conference_Location
Cheju Island
Print_ISBN
0-7803-5739-6
Type
conf
DOI
10.1109/TENCON.1999.818413
Filename
818413