• DocumentCode
    3017335
  • Title

    Online meeting recognizer with multichannel speaker diarization

  • Author

    Araki, Shoko ; Hori, Takaaki ; Fujimoto, Masakiyo ; Watanabe, Shinji ; Yoshioka, Takuya ; Nakatani, Tomohiro ; Nakamura, Atsushi

  • Author_Institution
    NTT Commun. Sci. Labs., NTT Corp., Kyoto, Japan
  • fYear
    2010
  • fDate
    7-10 Nov. 2010
  • Firstpage
    1697
  • Lastpage
    1701
  • Abstract
    We present our newly developed real-time conversation analyzer for group meetings. The goal of the system is to estimate automatically “who speaks when and what” in an online manner. In our system, “who speaks when” information is first obtained by estimating the directions of arrival (DOAs) of signals. Then, “who speaks what” is estimated with our automatic speech recognition (ASR) system, after suppressing reverberation, background noise, and interference speakers' voices. In this paper, we focus particularly on the speaker diarization (“who speaks when” estimation) method, and we show that the speaker diarization information helps the ASR to reduce insertion errors.
  • Keywords
    direction-of-arrival estimation; signal denoising; speaker recognition; ASR system; DOA estimation; automatic speech recognition system; background noise; directions of arrival estimation; insertion error reduction; interference speaker voice suppression; multichannel speaker diarization; online meeting recognizer; real-time conversation analyzer; reverberation suppression; Adaptation model; Microphones; Noise; Speech; Speech enhancement; Speech recognition;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Signals, Systems and Computers (ASILOMAR), 2010 Conference Record of the Forty Fourth Asilomar Conference on
  • Conference_Location
    Pacific Grove, CA
  • ISSN
    1058-6393
  • Print_ISBN
    978-1-4244-9722-5
  • Type

    conf

  • DOI
    10.1109/ACSSC.2010.5757829
  • Filename
    5757829