Title :
Speaker Diarization Features: The UPM Contribution to the RT09 Evaluation
Author :
Pardo, José M. ; Barra-Chicote, Roberto ; San-Segundo, Rubén ; De Córdoba, Ricardo ; Martínez-González, Beatriz
Author_Institution :
Speech Technol. Group, Univ. Politec. de Madrid, Madrid, Spain
Abstract :
The Universidad Politécnica de Madrid proposed and used two new features in the Rich Transcription Evaluation 2009 (RT09) that outperform the baseline system. The first feature is the intensity channel contribution, which is related to the location of the speaker. The second feature is the logarithm of the interpolated fundamental frequency. This is the first time that both features have been applied to the clustering stage of diarization for meetings recorded with multiple distant microphones. Including both features improves on the baseline results by 15.36% relative on the development set and by 16.71% relative on the RT09 set. Considering speaker errors only, the relative improvement is 23% on the development set and 32.83% on the RT09 set.
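As an illustration of the second feature, the following is a minimal sketch of how a log-interpolated F0 track could be computed. It assumes a per-frame F0 estimate in which 0 marks unvoiced frames, and it fills those frames by linear interpolation; both the convention and the choice of linear interpolation are assumptions made here, not details given in the abstract. The function name `log_interpolated_f0` is hypothetical.

```python
import numpy as np

def log_interpolated_f0(f0_hz):
    """Return log-F0 with unvoiced frames filled by linear interpolation.

    f0_hz: 1-D sequence of per-frame F0 estimates in Hz, where 0 marks an
    unvoiced frame (an assumed convention, not taken from the paper).
    """
    f0 = np.asarray(f0_hz, dtype=float)
    voiced = f0 > 0
    if not voiced.any():
        raise ValueError("no voiced frames to interpolate from")
    frames = np.arange(f0.size)
    # np.interp holds the nearest voiced value constant before the first
    # and after the last voiced frame.
    filled = np.interp(frames, frames[voiced], f0[voiced])
    return np.log(filled)

if __name__ == "__main__":
    # Toy pitch track: frames 0, 2, 3 and 5 are unvoiced.
    print(log_interpolated_f0([0, 100, 0, 0, 200, 0]))
```

Interpolating before taking the logarithm gives every frame a defined value, so the feature can be used alongside frame-level features such as MFCCs in clustering.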
Keywords :
feature extraction; interpolation; microphones; pattern clustering; speaker recognition; RT09 evaluation; Rich Transcription Evaluation 2009; UPM contribution; intensity channel contribution; interpolated fundamental frequency logarithm; multiple distant microphone meetings diarization; speaker diarization features; speaker errors; speaker location features; speaker segmentation; Density estimation robust algorithm; Estimation; Mel frequency cepstral coefficient; Merging; Speech; Features for speaker diarization; speaker diarization; speech processing in meetings
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
DOI :
10.1109/TASL.2011.2159971