DocumentCode :
3423414
Title :
Speaker normalization based on subglottal resonances
Author :
Wang, Shizhen ; Alwan, Abeer ; Lulich, Steven M.
Author_Institution :
Dept. of Electr. Eng., Univ. of California at Los Angeles, Los Angeles, CA
fYear :
2008
fDate :
March 31 2008-April 4 2008
Firstpage :
4277
Lastpage :
4280
Abstract :
Speaker normalization typically focuses on variability in the supra-glottal (vocal tract) resonances, which constitutes a major cause of spectral mismatch. Recent studies show that the subglottal airways also affect the spectral properties of speech sounds. This paper presents a speaker normalization method based on estimating the second and third subglottal resonances. Since the subglottal airways do not change for a specific speaker, the subglottal resonances are independent of the sound type (e.g., vowel or consonant) and remain constant for a given speaker. This context-free property makes the proposed method suitable for limited-data speaker adaptation. The method is computationally more efficient than maximum-likelihood-based VTLN and outperforms VTLN, especially when adaptation data are limited. Experimental results confirm that the method performs well in a variety of testing conditions and tasks.
Keywords :
speaker recognition; context-free property; data speaker adaptation; speaker normalization method; spectral mismatch; subglottal resonances; Frequency; Loudspeakers; Maximum likelihood detection; Maximum likelihood estimation; Oral communication; Performance evaluation; Resonance; Respiratory system; Speech; Testing; VTLN; speaker adaptation; speaker normalization; speech recognition; subglottal resonance
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on
Conference_Location :
Las Vegas, NV
ISSN :
1520-6149
Print_ISBN :
978-1-4244-1483-3
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2008.4518600
Filename :
4518600