DocumentCode :
2043502
Title :
“ISI” a new method for automatic speaker tracking and detection
Author :
Ouamour, S. ; Guerti, M. ; Sayoud, H.
Author_Institution :
Electron. Inst., USTHB, Algiers, Algeria
fYear :
2006
fDate :
20-22 March 2006
Firstpage :
1
Lastpage :
5
Abstract :
In this paper we propose a new algorithm called ISI, or “Interlaced Speech Indexing”, developed and implemented for the task of speaker detection and tracking. The task consists of finding the identity of a well-defined speaker and the moments of his interventions within an audio document, so that his speech can be accessed quickly, directly and easily. Speaker tracking can broadly be divided into two problems: locating the points of speaker change (segmentation of the document), and looking for the target speaker in each segment with a verification system in order to extract all of his speech in the document (speaker detection). For the segmentation task, we developed a method based on an interlaced equidistant segmentation (IES) associated with the ISI algorithm. This approach uses a speaker identification method based on Second Order Statistical Measures (SOSM). Among the SOSM measures, we chose the “μGc” measure, which is based on the covariance matrix. However, experiments showed that this method needs a speech length of at least 2 seconds, which means that the segmentation resolution would be 2 seconds. By combining the SOSM with the new indexing technique (ISI), we demonstrate that the average segmentation error is reduced to only 0.5 second, which is more accurate and more attractive for real-time applications. Results indicate that the SOSM-ISI combination provides high resolution and high tracking performance: the tracking score (percentage of correctly labelled segments) is 95% on the TIMIT database and 92.4% on the Hub4 database.
Keywords :
covariance matrices; signal detection; speaker recognition; speech processing; statistical analysis; ISI algorithm; SOSM; automatic speaker detection; automatic speaker tracking; covariance matrix; interlaced equidistant segmentation; interlaced speech indexing; second order statistical measure; speaker identification; speaker verification system; Indexing; Labeling; Noise; Speech; Target tracking; Segmentation; Speaker tracking
fLanguage :
English
Publisher :
IEEE
Conference_Title :
GCC Conference (GCC), 2006 IEEE
Conference_Location :
Manama
Print_ISBN :
978-0-7803-9590-9
Electronic_ISBN :
978-0-7803-9591-6
Type :
conf
DOI :
10.1109/IEEEGCC.2006.5686248
Filename :
5686248