• DocumentCode
    2021577
  • Title
    Concatenated phoneme models for text-variable speaker recognition
  • Author
    Matsui, Tomoko; Furui, Sadaoki
  • Author_Institution
    NTT Human Interface Lab., Musashino-Shi, Tokyo, Japan
  • Volume
    2
  • fYear
    1993
  • fDate
    27-30 April 1993
  • Firstpage
    391
  • Abstract
    Methods that create models to specify both speaker and phonetic information accurately by using only a small amount of training data for each speaker are investigated. For a text-dependent speaker recognition method, in which arbitrary key texts are prompted from the recognizer, speaker-specific phoneme models are necessary to identify the key text and recognize the speaker. Two methods of making speaker-specific phoneme models are discussed: phoneme-adaptation of a phoneme-independent speaker model and speaker-adaptation of universal phoneme models. The authors also investigate supplementing these methods by adding a phoneme-independent speaker model to make up for the lack of speaker information. This combination achieves a rejection rate as high as 98.5% for speech that differs from the key text and a speaker verification rate of 100.0%.
  • Keywords
    learning (artificial intelligence); speech recognition; concatenated phoneme models; rejection rate; speaker verification rate; speaker-specific phoneme models; text-variable speaker recognition; training data
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 1993. ICASSP-93., 1993 IEEE International Conference on
  • Conference_Location
    Minneapolis, MN, USA
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-7402-9
  • Type
    conf
  • DOI
    10.1109/ICASSP.1993.319321
  • Filename
    319321