• DocumentCode
    2790863
  • Title

    Multistream speaker diarization beyond two acoustic feature streams

  • Author

    Vijayasenan, Deepu ; Valente, Fabio ; Bourlard, Herve

  • Author_Institution
    Idiap Res. Inst., Martigny, Switzerland
  • fYear
    2010
  • fDate
    14-19 March 2010
  • Firstpage
    4950
  • Lastpage
    4953
  • Abstract
    Speaker diarization systems for meeting data have recently been converging towards multistream approaches. The features most commonly combined with MFCC are Time Delay of Arrival (TDOA) features. Other features have also been proposed, although no improvements over MFCC+TDOA systems have been reported. In this work we investigate the combination of other feature sets along with MFCC+TDOA. We discuss issues and problems related to the weighting of four different streams and propose a solution based on a smoothed version of the speaker error. Experiments are presented on the NIST RT06 meeting diarization evaluation. Results reveal that combining four acoustic feature streams yields a 30% relative improvement over the MFCC+TDOA feature combination. To the authors' best knowledge, this is the first successful attempt to improve the MFCC+TDOA baseline by including other feature streams.
  • Keywords
    acoustic signal processing; acoustic streaming; cepstral analysis; delays; feature extraction; speaker recognition; time-of-arrival estimation; unsupervised learning; acoustic feature stream; mel frequency cepstral coefficient; multistream speaker diarization; multistream systems; time delay of arrival; Audio recording; Delay effects; Hidden Markov models; Loudspeakers; Mel frequency cepstral coefficient; Microphones; NIST; Speech; Streaming media; Unsupervised learning; Feature combination; Information bottleneck principle; Speaker diarization;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
  • Conference_Location
    Dallas, TX
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4244-4295-9
  • Electronic_ISBN
    1520-6149
  • Type

    conf

  • DOI
    10.1109/ICASSP.2010.5495086
  • Filename
    5495086