• DocumentCode
    3530128
  • Title
    Modeling instantaneous intonation for speaker identification using the fundamental frequency variation spectrum
  • Author
    Laskowski, Kornel ; Jin, Qin
  • Author_Institution
    interACT, Carnegie Mellon Univ., Pittsburgh, PA
  • fYear
    2009
  • fDate
    19-24 April 2009
  • Firstpage
    4541
  • Lastpage
    4544
  • Abstract
    In recent years, the field of automatic speaker identification has begun to exploit high-level sources of speaker-discriminative information, in addition to traditional models of spectral shape. These sources include pronunciation models; prosodic dynamics; pitch, pause, and duration features; phone streams; and conversational interaction. As part of this broader thrust, we explore a new frame-level vector representation of the instantaneous change in fundamental frequency, known as fundamental frequency variation (FFV). The FFV spectrum consists of 7 continuous coefficients and can be modeled directly in a standard Gaussian mixture model (GMM) framework. Our experiments indicate that FFV features contain useful information for discriminating among speakers, and that model-space combination of FFV and cepstral features outperforms cepstral features alone. In particular, our results on 16 kHz Wall Street Journal data show relative reductions in error rate of 54% and 40% for female and male speakers, respectively.
  • Keywords
    Gaussian processes; cepstral analysis; speaker recognition; Gaussian mixture model; cepstral feature; conversational interaction; fundamental frequency variation spectrum; phone stream; pronunciation model; speaker identification; vector representation; Anechoic chambers; Cepstral analysis; Cepstrum; Disk recording; Error analysis; Flexible printed circuits; Frequency; Performance gain; Spectral shape; Statistics; Fundamental frequency; Intonation; Speaker identification
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on
  • Conference_Location
    Taipei
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4244-2353-8
  • Electronic_ISBN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2009.4960640
  • Filename
    4960640