GFM-Based Methods for Speaker Identification

Author

Bhardwaj, Shashank ; Srivastava, Sanjeev ; Hanmandlu, M. ; Gupta, J.R.P.

Author_Institution

Netaji Subhas Inst. of Technol., Univ. of Delhi, New Delhi, India

Volume

43

Issue

3

fYear

2013

fDate

June 2013

Firstpage

1047

Lastpage

1058

Abstract

This paper presents three novel methods for speaker identification, two of which use both the continuous density hidden Markov model (HMM) and the generalized fuzzy model (GFM), which combines the advantages of the Mamdani and Takagi-Sugeno models. In the first method, the HMM is used to extract a shape-based batch feature vector, which is fitted with the GFM to identify the speaker. The second method uses the Gaussian mixture model (GMM) together with the GFM to identify speakers. The third method is inspired by the way humans draw on mutual acquaintances when identifying a speaker. To assess the validity of the proposed models [HMM-GFM, GMM-GFM, and HMM-GFM (fusion)] in a real-life scenario, they are tested on the VoxForge speech corpus and on a subset of the 2003 National Institute of Standards and Technology evaluation data set. The models are also evaluated on a corrupted version of the VoxForge speech corpus, obtained by mixing it with different types of noise signals at different signal-to-noise ratios, and their performance is found to be superior to that of well-known models.

Keywords

Gaussian processes; feature extraction; fuzzy set theory; hidden Markov models; speaker recognition; GFM-based methods; GMM-GFM model; Gaussian mixture model; HMM-GFM fusion model; Mamdani model; National Institute of Standards and Technology evaluation data set; Takagi-Sugeno model; VoxForge speech corpus; continuous density hidden Markov model; generalized fuzzy model; mutual acquaintances; noisy signals; shape-based batch feature vector extraction; signal-to-noise ratios; speaker identification; Correlation; Shape; Speech; Vectors; Gaussian mixture model (GMM); generalized fuzzy model (GFM); hidden Markov model (HMM); shape-based batching (SBB); Algorithms; Artificial Intelligence; Biometry; Data Interpretation, Statistical; Fuzzy Logic; Humans; Information Storage and Retrieval; Markov Chains; Normal Distribution; Pattern Recognition, Automated; Speech Production Measurement

fLanguage

English

Journal_Title

Cybernetics, IEEE Transactions on

Publisher

IEEE

ISSN

2168-2267

Type

jour

DOI

10.1109/TSMCB.2012.2223461

Filename

6341116