• DocumentCode
    3528192
  • Title
    Effective metric-based speaker segmentation in the frequency domain
  • Author
    Boehm, Christoph; Pernkopf, Franz
  • Author_Institution
    Signal Process. & Speech Commun. Lab., Graz Univ. of Technol., Graz
  • fYear
    2009
  • fDate
    19-24 April 2009
  • Firstpage
    4081
  • Lastpage
    4084
  • Abstract
    In this paper, we present an approach, called FREQDIST, for speaker segmentation based on a distance measure applied in the frequency domain. To enhance detection performance, the spectrum is reweighted using normalization techniques. Additionally, noise-like (i.e., flat) spectra are removed based on their entropy. Experiments using the TIMIT database [1] and Westdeutscher Rundfunk broadcast data show that our segmentation approach performs well compared to the DISTBIC algorithm [2]. In particular, for the TIMIT data our algorithm reaches a false alarm rate (FAR) less than half that of the DISTBIC algorithm and a missed detection rate (MDR) of 7.0% instead of 13.1%.
  • Keywords
    frequency-domain analysis; speaker recognition; false alarm rate; frequency domain; metric-based speaker segmentation; normalization techniques; Broadcasting; Distance measurement; Feature extraction; Frequency domain analysis; Loudspeakers; Mel frequency cepstral coefficient; Resonant frequency; Signal processing; Signal processing algorithms; Speech; DISTBIC; FREQDIST; Speaker turn detection
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on
  • Conference_Location
    Taipei
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4244-2353-8
  • Electronic_ISBN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2009.4960525
  • Filename
    4960525