Multistream speaker diarization through Information Bottleneck system outputs combination

Author

Vijayasenan, Deepu ; Valente, Fabio ; Motlicek, Petr

Author_Institution

Idiap Res. Inst., Martigny, Switzerland

fYear

2011

fDate

22-27 May 2011

Firstpage

4420

Lastpage

4423

Abstract

Speaker diarization of meetings recorded with Multiple Distant Microphones makes extensive use of multiple feature streams such as MFCC and Time Delay of Arrivals (TDOA). Typically, the streams are combined using separate models for each feature stream. This work investigates whether multiple feature streams can instead be combined by combining the outputs of multiple diarization systems, each run on one of those features. The paper extends the previously proposed Information Bottleneck method to handle the combination of several probabilistic diarization outputs. In contrast to the conventional model-based feature combination, this technique is referred to as system-based combination. Furthermore, the paper introduces a hybrid model-system combination. Experiments are run on data from the Rich Transcription campaigns and show that the system-based combination outperforms the model-based combination by 37% relative. The hybrid approaches improve by 10-20%. The analysis of errors shows that the improvements come from the recordings where the individual MFCC and TDOA systems perform very differently.

Keywords

microphones; speaker recognition; time-of-arrival estimation; MFCC systems; TDOA systems; conventional model-based feature combination; information bottleneck system output combination; multiple distant microphones; multistream speaker diarization; rich transcription campaign; time delay of arrivals; Clustering algorithms; Computational modeling; Estimation; Hidden Markov models; Mel frequency cepstral coefficient; Mutual information; Speech; Feature combination; Information bottleneck principle; Speaker diarization; TDOA features; diarization system combination;

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on

Conference_Location

Prague

ISSN

1520-6149

Print_ISBN

978-1-4577-0538-0

Electronic_ISBN

1520-6149

Type

conf

DOI

10.1109/ICASSP.2011.5947334

Filename

5947334