DocumentCode
2173487
Title
Multistream speaker diarization through Information Bottleneck system outputs combination
Author
Vijayasenan, Deepu ; Valente, Fabio ; Motlicek, Petr
Author_Institution
Idiap Res. Inst., Martigny, Switzerland
fYear
2011
fDate
22-27 May 2011
Firstpage
4420
Lastpage
4423
Abstract
Speaker diarization of meetings recorded with Multiple Distant Microphones makes extensive use of multiple feature streams such as MFCC and Time Delay of Arrivals (TDOA). Typically, the streams are combined using a separate model for each feature stream. This work investigates whether multiple feature streams can instead be combined by combining the outputs of multiple diarization systems, each run on one of those features. The paper extends the previously proposed Information Bottleneck method to handle the combination of several probabilistic diarization outputs. In contrast to the conventional model-based feature combination, this technique is referred to as system-based combination. Furthermore, the paper introduces a hybrid model-system combination. Experiments are run on data from the Rich Transcription campaigns and show that the system-based combination outperforms the model-based combination by 37% relative. The hybrid approaches improve by 10-20%. The error analysis shows that the improvements come from the recordings where the individual MFCC and TDOA systems perform very differently.
Keywords
microphones; speaker recognition; time-of-arrival estimation; MFCC systems; TDOA systems; conventional model-based feature combination; information bottleneck system output combination; multiple distant microphones; multistream speaker diarization; rich transcription campaign; time delay of arrivals; Clustering algorithms; Computational modeling; Estimation; Hidden Markov models; Mel frequency cepstral coefficient; Mutual information; Speech; Feature combination; Information bottleneck principle; Speaker diarization; TDOA features; diarization system combination;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on
Conference_Location
Prague
ISSN
1520-6149
Print_ISBN
978-1-4577-0538-0
Electronic_ISBN
1520-6149
Type
conf
DOI
10.1109/ICASSP.2011.5947334
Filename
5947334