  • DocumentCode
    394224
  • Title
    Location based speaker segmentation
  • Author
    Lathoud, Guillaume; McCowan, Iain A.
  • Author_Institution
    Dalle Molle Inst. for Perceptual Artificial Intelligence, Martigny, Switzerland
  • Volume
    1
  • fYear
    2003
  • fDate
    6-10 April 2003
  • Abstract
    The paper proposes a technique that segments audio by speaker, based on speaker location. In many multi-party conversations, such as meetings, participants are restricted to a small number of regions, such as seats around a table or a position at a whiteboard. In such cases, segmentation according to these discrete regions would be a reliable means of determining speaker turns. We propose a system that uses microphone pair time delays as features to represent speaker locations. These features are integrated in a GMM/HMM framework to determine an optimal segmentation of the audio according to location. The HMM framework also allows extensions to recognise more complex structures, such as the presence of two simultaneous speakers. Experiments on real recordings from a meeting room show that the proposed location features can provide greater discrimination than standard cepstral features, and also demonstrate the success of an extension that handles dual-speaker overlap.
  • Keywords
    array signal processing; delays; hidden Markov models; microphones; speech processing; GMM; Gaussian mixture model; HMM; audio segmentation; dual-speaker overlap; hidden Markov model; microphone array processing; microphone pair time delays; multi-party conversations; speaker location; speaker segmentation; speech processing; Artificial intelligence; Cepstral analysis; Delay effects; Delay estimation; Disk recording; Hidden Markov models; Loudspeakers; Microphone arrays; Speech processing; System testing;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03). 2003 IEEE International Conference on
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-7663-3
  • Type
    conf
  • DOI
    10.1109/ICASSP.2003.1198745
  • Filename
    1198745
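
The abstract names microphone pair time delays as the location feature but does not say how the delays are estimated. As an illustration only, the sketch below assumes GCC-PHAT (generalised cross-correlation with phase transform), a common estimator for this purpose; the function name `gcc_phat_delay` and its parameters are hypothetical and are not taken from the paper.

```python
import numpy as np


def gcc_phat_delay(x, y, max_lag):
    """Estimate the delay of y relative to x, in samples, with GCC-PHAT.

    Assumption: GCC-PHAT is used here for illustration; the record does not
    state which delay estimator the paper uses. A positive result means y
    lags x. max_lag must be at least 1.
    """
    n = len(x) + len(y)                    # zero-pad to avoid circular wrap-around
    X = np.fft.rfft(x, n=n)
    Y = np.fft.rfft(y, n=n)
    cross = Y * np.conj(X)                 # cross-power spectrum
    cross /= np.abs(cross) + 1e-12         # PHAT weighting: keep phase only
    cc = np.fft.irfft(cross, n=n)
    # Reorder so the lags run from -max_lag to +max_lag.
    cc = np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))
    return int(np.argmax(cc)) - max_lag


# Synthetic check: y is x delayed by 5 samples.
rng = np.random.default_rng(0)
x = rng.standard_normal(1024)
y = np.concatenate((np.zeros(5), x[:-5]))
delay_xy = gcc_phat_delay(x, y, max_lag=20)
delay_yx = gcc_phat_delay(y, x, max_lag=20)
```

In a system of the kind the abstract describes, one such delay per microphone pair and per frame would be stacked into a feature vector and passed to the GMM/HMM stage; that stage is not sketched here.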