Title :
Pure segment selection as speaker diarization post-processing
Author :
Ben-Harush, Oshry ; Guterman, Hugo ; Lapidot, Itshak
Author_Institution :
Ben-Gurion Univ. of the Negev, Beer-Sheva, Israel
Abstract :
Audio diarization is the process of assigning temporal segments of an audio channel to the appropriate generating source according to specific acoustic properties. Sources can be speech, music, background noise, etc. Speaker diarization systems confront the problem of segmenting and labeling a conversation when no prior knowledge of the speakers is available. As human expert segmentation is time and money consuming, it is worthwhile to develop an automatic diarization system as a replacement for human expert segmentation in speaker recognition applications. However, diarization systems produce more falsely detected segments than can be allowed for speaker model training. This work focuses on reducing the falsely detected segments and on selecting "pure" segments, which contain only the required speaker's data. For this purpose, a measure of "purity" and a methodology for extracting the "pure" segments are required. In this paper, a pure segment selection algorithm employing an expert system decision is presented. The proposed system is based on majority vote and normalized maximum likelihood of the segments. The pure segment selection algorithm relies on the accuracy of the diarization system, which is based on Self-Organizing Maps (SOM) as speaker models. One hundred and eight conversations from the LDC America Call Home database are used for evaluation. The proposed approach shows a diarization error rate (DER) improvement of 29% relative to the DER achieved by the original diarization system.
Keywords :
expert systems; hidden Markov models; maximum likelihood estimation; self-organising feature maps; speaker recognition; LDC America Call Home database; acoustic property; audio channel temporal segment assignment; audio diarization; automatic speaker diarization system; diarization error rate; expert system decision; hidden Markov model; human expert segmentation; majority vote; normalized maximum likelihood; pure segment selection algorithm; self-organizing map; speaker diarization post-processing; speaker model training; speaker recognition; Background noise; Data mining; Density estimation robust algorithm; Expert systems; Humans; Labeling; Loudspeakers; Speaker recognition; Speech enhancement; Voting; Diarization; HMM; K-Means; SOM;
Conference_Titel :
Electrical and Electronics Engineers in Israel, 2008. IEEEI 2008. IEEE 25th Convention of
Conference_Location :
Eilat
Print_ISBN :
978-1-4244-2481-8
Electronic_ISBN :
978-1-4244-2482-5
DOI :
10.1109/EEEI.2008.4736571