• DocumentCode
    2173487
  • Title
    Multistream speaker diarization through Information Bottleneck system outputs combination
  • Author
    Vijayasenan, Deepu ; Valente, Fabio ; Motlicek, Petr
  • Author_Institution
    Idiap Res. Inst., Martigny, Switzerland
  • fYear
    2011
  • fDate
    22-27 May 2011
  • Firstpage
    4420
  • Lastpage
    4423
  • Abstract
    Speaker diarization of meetings recorded with Multiple Distant Microphones makes extensive use of multiple feature streams such as MFCC and Time Delay of Arrivals (TDOA). Typically, the streams are combined using a separate model for each feature stream. This work investigates whether multiple feature streams can instead be combined by combining the outputs of multiple diarization systems, each run on one of those features. The paper extends the previously proposed Information Bottleneck method to handle the combination of several probabilistic diarization outputs. In contrast to the conventional model-based feature combination, this technique is referred to as system-based combination. Furthermore, the paper introduces a hybrid model-system combination. Experiments are run on data from the Rich Transcription campaigns and show that the system-based combination outperforms the model-based combination by 37% relative. The hybrid approaches improve by 10-20%. The analysis of errors shows that the improvements come from the recordings where the individual MFCC and TDOA systems perform very differently.
  • Keywords
    microphones; speaker recognition; time-of-arrival estimation; MFCC systems; TDOA systems; conventional model-based feature combination; information bottleneck system output combination; multiple distant microphones; multistream speaker diarization; rich transcription campaign; time delay of arrivals; Clustering algorithms; Computational modeling; Estimation; Hidden Markov models; Mel frequency cepstral coefficient; Mutual information; Speech; Feature combination; Information bottleneck principle; Speaker diarization; TDOA features; diarization system combination;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on
  • Conference_Location
    Prague
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4577-0538-0
  • Electronic_ISBN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2011.5947334
  • Filename
    5947334