• DocumentCode
    1749685
  • Title
    Speaker identification using Gaussian mixture models based on multi-space probability distribution
  • Author
    Miyajima, Chiyomi ; Hattori, Yoshiyuki ; Tokuda, Keiichi ; Masuko, Takashi ; Kobayashi, Takao ; Kitamura, Tadashi
  • Author_Institution
    Nagoya Inst. of Technol., Japan
  • Volume
    1
  • fYear
    2001
  • fDate
    2001
  • Firstpage
    433
  • Abstract
    Presents an approach to modeling speech spectra and pitch for text-independent speaker identification using Gaussian mixture models based on multi-space probability distribution (MSD-GMM). The MSD-GMM allows us to model continuous pitch values for voiced frames and discrete symbols representing unvoiced frames in a unified framework. Spectral and pitch features are jointly modeled by a two-stream MSD-GMM. We derive maximum likelihood estimation formulae for the MSD-GMM parameters, and the MSD-GMM speaker models are evaluated on text-independent speaker identification tasks. Experimental results show that the MSD-GMM can efficiently model the spectral and pitch features of each speaker and outperforms conventional speaker models.
  • Keywords
    maximum likelihood estimation; probability; speaker recognition; Gaussian mixture models; continuous pitch values; discrete symbols; multi-space probability distribution; speech pitch; speech spectra; text-independent speaker identification; unvoiced frames; voiced frames; Cepstral analysis; Probability distribution; Societies; Speech
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 2001. Proceedings. (ICASSP '01). 2001 IEEE International Conference on
  • Conference_Location
    Salt Lake City, UT
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-7041-4
  • Type
    conf
  • DOI
    10.1109/ICASSP.2001.940860
  • Filename
    940860