DocumentCode :
3500720
Title :
Spatial speech coding for multi-teleconferencing
Author :
Phua, Kok Soon ; Gan, Woon Seng
Author_Institution :
Sch. of Electr. & Electron. Eng., Nanyang Technol. Univ., Singapore
Volume :
1
fYear :
1999
fDate :
1999
Firstpage :
313
Abstract :
This paper describes a structural model for implementing multichannel speech coding for teleconferencing with spatial audio reproduction. Multiple monaural speech sources are synthesized into binaural sound to produce a more realistic videoconferencing environment. The activity information of each binaural speech signal, as determined by a voice activity detection algorithm, is used to calculate two weighting factors prior to mixing. A third level of weight adjustment can be carried out by adjusting these weighting factors before applying them to the individual voice sources. A scheme to remove undesirable noise spikes is also introduced. Both channels are then coded individually using the G.723.1 speech codec.
Keywords :
sound reproduction; speech codecs; speech coding; teleconferencing; G.723.1 speech codec; binaural sound; cocktail party effect; multichannel speech coding; multiple monaural speech sources; noise spikes removal; spatial audio reproduction; structural model; teleconferencing; videoconferencing environment; voice activity detection algorithm; weighting factors; Detection algorithms; Electronic mail; Gallium nitride; Pulse modulation; Speech codecs; Speech coding; Speech processing; Speech synthesis; Telecommunication standards; Teleconferencing
fLanguage :
English
Publisher :
IEEE
Conference_Title :
TENCON 99. Proceedings of the IEEE Region 10 Conference
Conference_Location :
Cheju Island
Print_ISBN :
0-7803-5739-6
Type :
conf
DOI :
10.1109/TENCON.1999.818413
Filename :
818413