DocumentCode :
1720928
Title :
A DOA Based Speaker Diarization System for Real Meetings
Author :
Araki, Shoko ; Fujimoto, Masakiyo ; Ishizuka, Kentaro ; Sawada, Hiroshi ; Makino, Shoji
Author_Institution :
NTT Commun. Sci. Labs., NTT Corp., Kyoto
fYear :
2008
Firstpage :
29
Lastpage :
32
Abstract :
This paper presents a speaker diarization system that estimates who spoke when in a meeting. Our proposed system is realized by using a noise robust voice activity detector (VAD), a direction of arrival (DOA) estimator, and a DOA classifier. Our previous system utilized the generalized cross correlation method with the phase transform (GCC-PHAT) approach for the DOA estimation. Because the GCC-PHAT can estimate just one DOA per frame, it was difficult to handle speaker overlaps. This paper tries to deal with this issue by employing a DOA at each time-frequency slot (TFDOA), and reports how it improves diarization performance for real meetings / conversations recorded in a room with a reverberation time of 350 ms.
Keywords :
direction-of-arrival estimation; speaker recognition; time-frequency analysis; direction of arrival estimation; generalized cross correlation method; phase transform; real meeting; speaker diarization system; time 350 ms; time-frequency slot DOA estimation; voice activity detector; Correlation; Detectors; Direction of arrival estimation; Laboratories; Microphones; Noise robustness; Phase estimation; Reverberation; Speech; Time frequency analysis; diarization; direction of arrival; voice activity detector;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Hands-Free Speech Communication and Microphone Arrays, 2008. HSCMA 2008
Conference_Location :
Trento
Print_ISBN :
978-1-4244-2337-8
Electronic_ISBN :
978-1-4244-2338-5
Type :
conf
DOI :
10.1109/HSCMA.2008.4538680
Filename :
4538680