• DocumentCode
    2970871
  • Title
    Robust Speaker Diarization for short speech recordings
  • Author
    Imseng, David ; Friedland, Gerald
  • Author_Institution
    Idiap Res. Inst., Martigny, Switzerland
  • fYear
    2009
  • fDate
    Nov. 13 2009-Dec. 17 2009
  • Firstpage
    432
  • Lastpage
    437
  • Abstract
    We investigate a state-of-the-art speaker diarization system regarding its behavior on meetings that are much shorter (from 500 seconds down to 100 seconds) than those typically analyzed in speaker diarization benchmarks. First, the problems inherent to this task are analyzed. Then, we propose an approach that consists of a novel initialization parameter estimation method for typical state-of-the-art diarization approaches. The estimation method balances the relationship between the optimal value of the duration of speech data per Gaussian and the duration of the speech data, which is verified experimentally for the first time in this article. As a result, the diarization error rate for short meetings extracted from the 2006, 2007, and 2009 NIST RT evaluation data is decreased by up to 50% relative.
  • Keywords
    Gaussian distribution; parameter estimation; speaker recognition; diarization error rate; initialization parameter estimation method; robust speaker diarization system; short speech recordings; speech data per Gaussian; Bayesian methods; Cepstral analysis; Computer science; Data mining; Error analysis; NIST; Parameter estimation; Robustness; Speech; Testing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Automatic Speech Recognition & Understanding, 2009. ASRU 2009. IEEE Workshop on
  • Conference_Location
    Merano
  • Print_ISBN
    978-1-4244-5478-5
  • Electronic_ISBN
    978-1-4244-5479-2
  • Type
    conf
  • DOI
    10.1109/ASRU.2009.5373254
  • Filename
    5373254