• DocumentCode
    3244047
  • Title
    Speaker recognition using prosodic and lexical features
  • Author
    Kajarekar, Sachin ; Ferrer, Luciana ; Venkataraman, Anand ; Sonmez, Kemal ; Shriberg, Elizabeth ; Stolcke, Andreas ; Bratt, Harry ; Gadde, Ramana Rao
  • Author_Institution
    SRI Int., Menlo Park, CA, USA
  • fYear
    2003
  • fDate
    30 Nov.-3 Dec. 2003
  • Firstpage
    19
  • Lastpage
    24
  • Abstract
    Conventional speaker recognition systems identify speakers by using spectral information from very short slices of speech. Such systems perform well (especially in quiet conditions), but fail to capture idiosyncratic longer-term patterns in a speaker's habitual speaking style, including duration and pausing patterns, intonation contours, and the use of particular phrases. We investigate the contribution of modeling such prosodic and lexical patterns to performance in the NIST 2003 Speaker Recognition Evaluation extended data task. We report results for: (1) systems based on individual feature types alone; (2) systems in combination with a state-of-the-art frame-based baseline system; (3) an all-system combination. Our results show that certain longer-term stylistic features provide powerful complementary information both to frame-level cepstral features and to each other. Stylistic features thus significantly improve speaker recognition performance over conventional systems, and offer promise for a variety of intelligence and security applications.
  • Keywords
    cepstral analysis; natural languages; speaker recognition; NIST 2003 Speaker Recognition Evaluation; cepstral features; duration patterns; intelligence; intonation contours; lexical features; pausing patterns; prosodic features; security; speaking style; spectral information; stylistic features; Cepstral analysis; Computer science; Databases; Information security; Loudspeakers; NIST; Power system modeling; Power system security; Speaker recognition; Speech;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Automatic Speech Recognition and Understanding, 2003. ASRU '03. 2003 IEEE Workshop on
  • Print_ISBN
    0-7803-7980-2
  • Type
    conf
  • DOI
    10.1109/ASRU.2003.1318397
  • Filename
    1318397