DocumentCode :
3430643
Title :
Vocal source features for bilingual speaker identification
Author :
Jianglin Wang ; Johnson, Matthew Thomas
Author_Institution :
Dept. of Electr. & Comput. Eng., Marquette Univ., Milwaukee, WI, USA
fYear :
2013
fDate :
6-10 July 2013
Firstpage :
170
Lastpage :
173
Abstract :
This paper introduces the use of two new features for speaker identification, Residual Phase Cepstrum Coefficients (RPCC) and Glottal Flow Cepstrum Coefficients (GLFCC), to capture speaker-specific characteristics from their vocal excitation patterns. Results on a cross-lingual speaker identification task taken from the NIST 2004 SRE demonstrate that these RPCC and GLFCC features are significantly more accurate than traditional mel-frequency cepstral coefficients (MFCC). In particular, these two new features give better results with smaller amounts of training data, due to lower model complexity.
Keywords :
Gaussian processes; maximum likelihood estimation; natural language processing; speaker recognition; GLFCC; GMM-UBM; Gaussian mixture model-universal background model; NIST 2004 SRE; RPCC; bilingual speaker identification; cross-lingual speaker identification task; glottal flow cepstrum coefficients; maximum a posteriori adaptation; model complexity; residual phase cepstrum coefficients; speaker-specific characteristics; vocal excitation patterns; vocal source features; Accuracy; Adaptation models; Feature extraction; Filtering; Mel frequency cepstral coefficient; Speaker recognition; Speech; Glottal source excitation; IAIF and GMM; Speaker identification
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Signal and Information Processing (ChinaSIP), 2013 IEEE China Summit & International Conference on
Conference_Location :
Beijing
Type :
conf
DOI :
10.1109/ChinaSIP.2013.6625321
Filename :
6625321