Title :
Approaches and applications of audio diarization
Author :
Reynolds, D.A. ; Torres-Carrasquillo, P.
Author_Institution :
Lincoln Lab., MIT, Lexington, MA, USA
Abstract :
Audio diarization is the process of annotating an input audio channel with information that attributes (possibly overlapping) temporal regions of signal energy to their specific sources. These sources can include particular speakers, music, background noise sources, and other signal source/channel characteristics. Diarization has utility in making automatic transcripts more readable and in searching and indexing audio archives. In this paper, we provide an overview of current audio diarization approaches and discuss performance and potential applications. We outline the general framework of diarization systems and present the performance of current systems as measured in the DARPA EARS Rich Transcription Fall 2004 (RT-04F) speaker diarization evaluation. Lastly, we look at future challenges and directions for diarization research.
Keywords :
audio signal processing; signal classification; speech processing; audio archive indexing; audio archive searching; audio channel annotation; audio diarization; audio source categorization; automatic transcripts; background noise sources; meta-data; music; signal energy temporal region source determination; speaker; speech detection; Acoustic noise; Audio recording; Bandwidth; Broadcasting; Data mining; Indexing; NIST; Speech analysis; Speech enhancement; Speech recognition
Conference_Title :
Acoustics, Speech, and Signal Processing, 2005. Proceedings. (ICASSP '05). IEEE International Conference on
Print_ISBN :
0-7803-8874-7
DOI :
10.1109/ICASSP.2005.1416463