DocumentCode :
2173487
Title :
Multistream speaker diarization through Information Bottleneck system outputs combination
Author :
Vijayasenan, Deepu ; Valente, Fabio ; Motlicek, Petr
Author_Institution :
Idiap Res. Inst., Martigny, Switzerland
fYear :
2011
fDate :
22-27 May 2011
Firstpage :
4420
Lastpage :
4423
Abstract :
Speaker diarization of meetings recorded with Multiple Distant Microphones makes extensive use of multiple feature streams such as MFCC and Time Delay of Arrival (TDOA) features. Typically, the streams are combined using separate models for each feature stream. This work investigates whether multiple feature streams can instead be combined by combining the outputs of multiple diarization systems, each run on one of those features. The paper extends the previously proposed Information Bottleneck method to handle the combination of several probabilistic diarization outputs. In contrast to the conventional model-based feature combination, this technique is referred to as system-based combination. Furthermore, the paper introduces a hybrid model-system combination. Experiments are run on data from the Rich Transcription campaigns and show that the system-based combination outperforms the model-based combination by 37% relative. The hybrid approaches improve by 10-20%. The analysis of errors shows that the improvements come from recordings where the individual MFCC and TDOA systems perform very differently.
Keywords :
microphones; speaker recognition; time-of-arrival estimation; MFCC systems; TDOA systems; conventional model-based feature combination; information bottleneck system output combination; multiple distant microphones; multistream speaker diarization; rich transcription campaign; time delay of arrivals; Clustering algorithms; Computational modeling; Estimation; Hidden Markov models; Mel frequency cepstral coefficient; Mutual information; Speech; Feature combination; Information bottleneck principle; Speaker diarization; TDOA features; diarization system combination;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on
Conference_Location :
Prague
ISSN :
1520-6149
Print_ISBN :
978-1-4577-0538-0
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2011.5947334
Filename :
5947334