DocumentCode
2436097
Title
Detection of Speaker Change Points in Conversational Speech
Author
Carlin, Michael A.; Smolenski, Brett Y.
Author_Institution
Air Force Res. Lab. / IFEC, Rome
fYear
2007
fDate
3-10 March 2007
Firstpage
1
Lastpage
8
Abstract
An important preprocessing step in many automatic speech segmentation and speaker clustering systems is the accurate detection of speaker change points, the times when one speaker stops talking and another begins. This becomes very difficult in conversational speech, however, because utterances can be extremely short, speaker changes occur frequently, speakers may talk over one another (co-channel interference), and the recording environment and/or communication channel is sub-optimal or degraded. Modern aviation systems can benefit from this research as a preprocessing stage in a variety of applications. Examples include automatic segmentation and clustering of pilot/air traffic controller communications, detection of a third or unauthorized speaker in commercial airline cockpits, and automatic transcription of cockpit audio recordings. This research presents an approach to detecting speaker change points using information obtained from voiced speech segments. The approach takes advantage of two facts: (1) speaker starting and stopping information should be contained between segments of voiced speech, and (2) voiced speech contains the most useful speaker-identifying information. The technique presented here shows promise as an enhancement to currently available change point detection algorithms.
Keywords
aircraft; pattern clustering; speaker recognition; automatic speech segmentation; automatic transcription; aviation systems; co-channel interference; cockpit audio recordings; commercial airline cockpits; communication channel; conversational speech; pilot-air traffic controller communications; recording environment; speaker change point detection; speaker clustering systems; speaker identifiable information; starting information; stopping information; unauthorized speaker detection; utterance lengths; voiced speech segments; Air traffic control; Audio recording; Change detection algorithms; Communication channels; Degradation; Humans; Interchannel interference; Laboratories; Loudspeakers; Speech
fLanguage
English
Publisher
IEEE
Conference_Titel
Aerospace Conference, 2007 IEEE
Conference_Location
Big Sky, MT
ISSN
1095-323X
Print_ISBN
1-4244-0524-6
Electronic_ISBN
1095-323X
Type
conf
DOI
10.1109/AERO.2007.352978
Filename
4161418