DocumentCode :
3855506
Title :
Speaker overlap detection with prosodic features for speaker diarisation
Author :
M. Zelenák;J. Hernando
Author_Institution :
TALP Res. Center, Univ. Politec. de Catalunya, Barcelona, Spain
Volume :
6
Issue :
8
fYear :
2012
fDate :
10/1/2012 12:00:00 AM
Firstpage :
798
Lastpage :
804
Abstract :
The handling of overlapping speech in speaker diarisation has attracted the interest of the scientific community in recent years, since speaker overlap has been identified as one of the factors degrading the performance of conventional diarisation systems. In this study, the authors discuss the possibility of using long-term prosodic features to detect overlapping speech, which is subsequently employed in speaker diarisation to improve on the baseline system. The most relevant subset of the candidate prosodic features is determined in two steps. First, the features are ranked according to the minimal-redundancy-maximal-relevance criterion; then a hill-climbing wrapper strategy is applied to determine the optimal number of prosodic features to accompany the short-term spectral features for overlap detection. In experiments on augmented multi-party interaction (AMI) meeting distant-channel data, the authors show that adding prosodic features decreased the overlap detection error. Detected overlap segments were used in speaker diarisation to recover missed speech by assigning multiple speaker labels and to increase the purity of speaker clusters. Improvements over the baseline diarisation system are reported in both single- and multi-site data conditions. However, extending the diarisation system with TDOAs proved incompatible with the overlap exclusion technique.
Journal_Title :
IET Signal Processing
Publisher :
IET
ISSN :
1751-9675
Type :
jour
DOI :
10.1049/iet-spr.2011.0233
Filename :
6410956