Title :
Improving speaker diarization using social role information
Author :
Sapru, Ashtosh ; Yella, Sree Harsha ; Bourlard, Herve
Author_Institution :
Idiap Res. Inst., Martigny, Switzerland
Abstract :
Speaker diarization systems for meetings commonly model acoustic and spatial information, ignoring the fact that meetings are instances of human interaction. Recent studies have shown that social roles influence the interaction patterns of speakers. This paper proposes a novel method to integrate social role information into the speaker diarization framework. First, we modify the minimum duration constraint in the baseline diarization system by using role information to model the expected duration of a speaker's turn. Second, we propose a social role n-gram model as prior information on speaker interaction patterns. The proposed method is integrated into a state-of-the-art diarization system to reduce the speaker error. Experiments are performed on the AMI corpus, which is annotated in terms of social roles. The proposed method reduces the speaker error by 16% relative to the baseline HMM-GMM system. The paper also investigates the performance of the proposed method on other meeting scenarios, such as those from the NIST Rich Transcription campaigns. Experiments on Rich Transcription meetings show that the speaker error can be reduced by 13% relative to the baseline system, demonstrating the potential of the proposed method.
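To make the idea of a social role n-gram prior concrete, the following is a minimal sketch of a role bigram model estimated from sequences of speaker turns. It is not the authors' implementation: the role labels, the toy training sequences, the add-alpha smoothing, and the function names train_role_bigram and sequence_log_prior are all assumptions made for illustration.

```python
import math
from collections import Counter


def train_role_bigram(sequences, roles, alpha=1.0):
    """Estimate P(next_role | prev_role) with add-alpha smoothing.

    `sequences` is a list of role sequences, one role label per speaker turn.
    The smoothing scheme is an assumption, not taken from the paper.
    """
    pair_counts = Counter()
    prev_counts = Counter()
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            pair_counts[(prev, nxt)] += 1
            prev_counts[prev] += 1
    n_roles = len(roles)
    return {
        (p, n): (pair_counts[(p, n)] + alpha) / (prev_counts[p] + alpha * n_roles)
        for p in roles
        for n in roles
    }


def sequence_log_prior(seq, bigram):
    """Log prior of a sequence of turns under the role bigram model."""
    return sum(math.log(bigram[(p, n)]) for p, n in zip(seq, seq[1:]))


# Illustrative placeholder labels and toy data (hypothetical, not AMI annotations).
ROLES = ["protagonist", "supporter", "neutral", "gatekeeper"]
train = [
    ["protagonist", "supporter", "protagonist", "neutral"],
    ["gatekeeper", "protagonist", "supporter", "protagonist"],
]
model = train_role_bigram(train, ROLES)
log_prior = sequence_log_prior(["protagonist", "supporter", "protagonist"], model)
```

In a diarization decoder, a log prior of this kind could be added to the acoustic score when hypothesizing the next speaker; how the paper combines the two is not specified in this abstract.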
Keywords :
Gaussian processes; hidden Markov models; mixture models; speech processing; AMI corpus; Gaussian mixture modeling; NIST Rich Transcription campaigns; Rich Transcription meetings; acoustic information; baseline HMM-GMM system; baseline diarization system; hidden Markov model; minimum duration constraint; social role information; social role n-gram model; spatial information; speaker diarization; speaker interaction patterns; speaker turn expected duration; Acoustics; Feature extraction; Hidden Markov models; Histograms; NIST; Speech; Speech processing; HMM-GMM; Social Roles; Speaker diarization;
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
Conference_Location :
Florence
DOI :
10.1109/ICASSP.2014.6853566