DocumentCode
1749685
Title
Speaker identification using Gaussian mixture models based on multi-space probability distribution
Author
Miyajima, Chiyomi ; Hattori, Yoshiyuki ; Tokuda, Keiichi ; Masuko, Takashi ; Kobayashi, Takao ; Kitamura, Tadashi
Author_Institution
Nagoya Inst. of Technol., Japan
Volume
1
fYear
2001
fDate
2001
Firstpage
433
Abstract
Presents an approach to modeling speech spectra and pitch for text-independent speaker identification using Gaussian mixture models based on multi-space probability distribution (MSD-GMM). The MSD-GMM allows us to model continuous pitch values for voiced frames and discrete symbols representing unvoiced frames in a unified framework. Spectral and pitch features are jointly modeled by a two-stream MSD-GMM. We derive maximum likelihood estimation formulae for the MSD-GMM parameters, and the MSD-GMM speaker models are evaluated for text-independent speaker identification tasks. Experimental results show that the MSD-GMM can efficiently model spectral and pitch features of each speaker and outperforms conventional speaker models.
Keywords
maximum likelihood estimation; probability; speaker recognition; Gaussian mixture models; continuous pitch values; discrete symbols; maximum likelihood estimation; multi-space probability distribution; speech pitch; speech spectra; text-independent speaker identification; unvoiced frames; voiced frames; Cepstral analysis; Maximum likelihood estimation; Probability distribution; Societies; Speaker recognition; Speech;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech, and Signal Processing, 2001. Proceedings. (ICASSP '01). 2001 IEEE International Conference on
Conference_Location
Salt Lake City, UT
ISSN
1520-6149
Print_ISBN
0-7803-7041-4
Type
conf
DOI
10.1109/ICASSP.2001.940860
Filename
940860