Online meeting recognizer with multichannel speaker diarization

Author

Araki, Shoko ; Hori, Takaaki ; Fujimoto, Masakiyo ; Watanabe, Shinji ; Yoshioka, Takuya ; Nakatani, Tomohiro ; Nakamura, Atsushi

Author_Institution

NTT Commun. Sci. Labs., NTT Corp., Kyoto, Japan

fYear

2010

fDate

7-10 Nov. 2010

Firstpage

1697

Lastpage

1701

Abstract

We present our newly developed real-time conversation analyzer for group meetings. The goal of the system is to estimate automatically “who speaks when and what” in an online manner. In our system, “who speaks when” information is first obtained by estimating the directions of arrival (DOAs) of signals. Then, “who speaks what” is estimated with our automatic speech recognition (ASR) system, after suppressing reverberation, background noise, and interference speakers' voices. In this paper, we focus particularly on the speaker diarization (“who speaks when” estimation) method, and we show that the speaker diarization information helps the ASR to reduce insertion errors.

Keywords

direction-of-arrival estimation; signal denoising; speaker recognition; ASR system; DOA estimation; automatic speech recognition system; background noise; directions of arrival estimation; insertion error reduction; interference speaker voice suppression; multichannel speaker diarization; online meeting recognizer; real-time conversation analyzer; reverberation suppression; Adaptation model; Microphones; Noise; Speech; Speech enhancement; Speech recognition

fLanguage

English

Publisher

IEEE

Conference_Titel

Signals, Systems and Computers (ASILOMAR), 2010 Conference Record of the Forty Fourth Asilomar Conference on

Conference_Location

Pacific Grove, CA

ISSN

1058-6393

Print_ISBN

978-1-4244-9722-5

Type

conf

DOI

10.1109/ACSSC.2010.5757829

Filename

5757829