DocumentCode :
3744910
Title :
Variational Bayesian PLDA for speaker diarization in the MGB challenge
Author :
Jesús Villalba;Alfonso Ortega;Antonio Miguel;Eduardo Lleida
Author_Institution :
ViVoLab, Aragon Institute for Engineering Research (I3A), University of Zaragoza, Spain
fYear :
2015
Firstpage :
667
Lastpage :
674
Abstract :
This paper describes the ViVoLab speaker diarization system for the Multi-Genre Broadcast (MGB) Challenge at ASRU 2015. The challenge data consisted of BBC TV programmes of different genres. Diarization followed a longitudinal setup, i.e., the speakers of the current episode had to be linked to the speakers in previous episodes of the same show. We propose a system based on the i-vector paradigm. After an initial segmentation step, we compute an i-vector per speech segment. Then, a generative model based on Bayesian PLDA clusters the speakers. In this model, the speaker labels are latent variables that we optimize by variational Bayes iterations. The number of speakers in each episode is decided by maximizing the variational lower bound. The system includes several phases of segment merging and re-clustering. We re-compute i-vectors after each merging step, which reduces the i-vector uncertainty. This approach attained a diarization error rate (DER) of around 30% on the development set.
Keywords :
"Bayes methods","Computational modeling","Speech","Mathematical model","TV","Merging","Annealing"
Publisher :
IEEE
Conference_Title :
Automatic Speech Recognition and Understanding (ASRU), 2015 IEEE Workshop on
Type :
conf
DOI :
10.1109/ASRU.2015.7404860
Filename :
7404860