DocumentCode
3017335
Title
Online meeting recognizer with multichannel speaker diarization
Author
Araki, Shoko ; Hori, Takaaki ; Fujimoto, Masakiyo ; Watanabe, Shinji ; Yoshioka, Takuya ; Nakatani, Tomohiro ; Nakamura, Atsushi
Author_Institution
NTT Commun. Sci. Labs., NTT Corp., Kyoto, Japan
fYear
2010
fDate
7-10 Nov. 2010
Firstpage
1697
Lastpage
1701
Abstract
We present our newly developed real-time conversation analyzer for group meetings. The goal of the system is to estimate automatically “who speaks when and what” in an online manner. In our system, “who speaks when” information is first obtained by estimating the directions of arrival (DOAs) of signals. Then, “who speaks what” is estimated with our automatic speech recognition (ASR) system, after suppressing reverberation, background noise, and interference speakers' voices. In this paper, we focus particularly on the speaker diarization (“who speaks when” estimation) method, and we show that the speaker diarization information helps the ASR to reduce insertion errors.
Keywords
direction-of-arrival estimation; signal denoising; speaker recognition; ASR system; DOA estimation; automatic speech recognition system; background noise; directions of arrival estimation; insertion error reduction; interference speaker voice suppression; multichannel speaker diarization; online meeting recognizer; real-time conversation analyzer; reverberation suppression; Adaptation model; Microphones; Noise; Speech; Speech enhancement; Speech recognition
fLanguage
English
Publisher
IEEE
Conference_Titel
Signals, Systems and Computers (ASILOMAR), 2010 Conference Record of the Forty Fourth Asilomar Conference on
Conference_Location
Pacific Grove, CA
ISSN
1058-6393
Print_ISBN
978-1-4244-9722-5
Type
conf
DOI
10.1109/ACSSC.2010.5757829
Filename
5757829