• DocumentCode
    2436097
  • Title
    Detection of Speaker Change Points in Conversational Speech
  • Author
    Carlin, Michael A.; Smolenski, Brett Y.
  • Author_Institution
    Air Force Res. Lab. / IFEC, Rome
  • fYear
    2007
  • fDate
    3-10 March 2007
  • Firstpage
    1
  • Lastpage
    8
  • Abstract
    An important preprocessing step in many automatic speech segmentation and speaker clustering systems is the accurate detection of speaker change points, the times when one speaker stops talking and another begins. However, this becomes very difficult in conversational speech since utterance lengths can be extremely short, speaker changes occur frequently, speakers may talk over one another (co-channel interference), and the recording environment and/or communication channel is sub-optimal or degraded. Modern aviation systems can benefit from this research as a pre-processing stage in a variety of applications. Examples include automatic segmentation and clustering of pilot/air traffic controller communications, detection of a third or unauthorized speaker in commercial airline cockpits, and automatic transcription of cockpit audio recordings. This research presents an approach to detecting speaker change points using information obtained from voiced speech segments. This permits taking advantage of the facts that (1) speaker starting and stopping information should be contained between segments of voiced speech and (2) voiced speech contains the most useful speaker identifiable information. The technique presented here shows promise as an enhancement to currently available change point detection algorithms.
  • Keywords
    aircraft; pattern clustering; speaker recognition; automatic speech segmentation; automatic transcription; aviation systems; co-channel interference; cockpit audio recordings; commercial airline cockpits; communication channel; conversational speech; pilot-air traffic controller communications; recording environment; speaker change point detection; speaker clustering systems; speaker identifiable information; starting information; stopping information; unauthorized speaker detection; utterance lengths; voiced speech segments; Air traffic control; Audio recording; Change detection algorithms; Communication channels; Degradation; Humans; Interchannel interference; Laboratories; Loudspeakers; Speech
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Aerospace Conference, 2007 IEEE
  • Conference_Location
    Big Sky, MT
  • ISSN
    1095-323X
  • Print_ISBN
    1-4244-0524-6
  • Electronic_ISBN
    1095-323X
  • Type
    conf
  • DOI
    10.1109/AERO.2007.352978
  • Filename
    4161418