DocumentCode
2173546
Title
Fast speaker diarization based on binary keys
Author
Anguera, Xavier ; Bonastre, Jean-François
Author_Institution
Telefonica Res., Barcelona, Spain
fYear
2011
fDate
22-27 May 2011
Firstpage
4428
Lastpage
4431
Abstract
Splitting a speech signal into speakers is the main goal of a speaker diarization system, which has become an important building block in many speech processing algorithms. Current state-of-the-art systems obtain good diarization error rates, but most of them are rather slow, which is a strong handicap in applications that require overall faster-than-real-time processing. In this paper we present a novel speaker diarization system that follows a bottom-up agglomerative clustering approach and is based on speaker binary keys, recently proposed for speaker modeling. After initialization, processing is done entirely over binary vectors, using exclusively binary metrics, which makes the system very fast. In tests performed on all conference meeting datasets released for the NIST RT evaluation campaigns, we achieve diarization error rates only slightly worse than those of a classic acoustic-based system while running over 10 times faster.
Keywords
speech processing; NIST RT evaluation; agglomerative clustering approach; binary keys; diarization error rates; fast speaker diarization; speaker binary keys; speaker diarization system; speech signal; Acoustics; Clustering algorithms; Computational modeling; Density estimation robust algorithm; Real time systems; Speech; Training; binary; discrete; discriminant; rich transcription; speaker diarization
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on
Conference_Location
Prague
ISSN
1520-6149
Print_ISBN
978-1-4577-0538-0
Electronic_ISBN
1520-6149
Type
conf
DOI
10.1109/ICASSP.2011.5947336
Filename
5947336