DocumentCode :
1693342
Title :
Non-linear frequency warping for VTLN using subglottal resonances and the third formant frequency
Author :
Arsikere, Harish ; Lulich, Steven M. ; Alwan, Abeer
Author_Institution :
Dept. of Electr. Eng., Univ. of California, Los Angeles, Los Angeles, CA, USA
fYear :
2013
Firstpage :
7922
Lastpage :
7926
Abstract :
This paper proposes a non-linear frequency warping scheme for vocal-tract length normalization (VTLN). The scheme maps the subglottal resonances (SGRs) and the third formant frequency (F3) of a given utterance to those of a reference speaker. SGRs are used because they relate to formants in specific ways while remaining phonetically invariant, and F3 is used because it is somewhat correlated with vocal-tract length. Given an utterance, the warping parameters (SGRs and F3) are determined by obtaining initial estimates from the signal and then refining those estimates with respect to a speaker-independent model. For children (TIDIGITS), the proposed method yields statistically significant word error rate (WER) reductions (up to 15%) relative to conventional VTLN (linear warping) when (1) speakers show poor baseline performance, and/or (2) training data are limited. For adults (Wall Street Journal), the WER reduction relative to conventional VTLN is 4-5%. Comparisons with other non-linear warping techniques are also reported.
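The landmark-mapping idea in the abstract can be illustrated with a minimal sketch. It assumes a piecewise-linear warp that pins each speaker landmark (three SGRs and F3) to the corresponding reference landmark; the abstract does not state the actual form of the warping function, so this may differ from the authors' method. The function name `piecewise_linear_warp`, the 8 kHz band edge, and all landmark values are hypothetical and chosen only for illustration.

```python
import numpy as np

def piecewise_linear_warp(freqs_hz, speaker_landmarks_hz, reference_landmarks_hz, band_edge_hz):
    """Warp frequencies so each speaker landmark lands on its reference landmark.

    Assumption: a piecewise-linear warp anchored at 0 Hz and at the band edge.
    This is an illustrative choice, not the warping function from the paper.
    """
    speaker = np.asarray(speaker_landmarks_hz, dtype=float)
    reference = np.asarray(reference_landmarks_hz, dtype=float)
    if speaker.shape != reference.shape:
        raise ValueError("speaker and reference landmarks must pair one-to-one")

    # Fix both ends of the band so the warp maps [0, band_edge] onto itself.
    src = np.concatenate(([0.0], speaker, [band_edge_hz]))
    dst = np.concatenate(([0.0], reference, [band_edge_hz]))

    # Strictly increasing knots keep the warp monotonic and invertible.
    if np.any(np.diff(src) <= 0) or np.any(np.diff(dst) <= 0):
        raise ValueError("landmarks must be strictly increasing and inside the band")

    return np.interp(freqs_hz, src, dst)

# Hypothetical landmark values in Hz (Sg1, Sg2, Sg3, F3); not taken from the paper.
speaker_landmarks = [700.0, 1650.0, 2500.0, 3200.0]
reference_landmarks = [600.0, 1400.0, 2150.0, 2600.0]
band_edge = 8000.0

# Warp a grid of analysis frequencies, e.g. filterbank centre frequencies.
grid = np.linspace(0.0, band_edge, 9)
warped_grid = piecewise_linear_warp(grid, speaker_landmarks, reference_landmarks, band_edge)
```

Unlike a linear warp, which scales the whole frequency axis by a single factor, this kind of mapping lets each region between landmarks stretch or compress by a different amount.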
Keywords :
speech processing; SGR mapping; TIDIGITS; VTLN; WER reduction; Wall Street Journal; linear warping; nonlinear frequency warping scheme; speaker-independent model; subglottal resonance mapping; subglottal resonances; third formant frequency mapping; vocal-tract length normalization; word error rate reduction; Estimation; Hidden Markov models; Speech; Speech recognition; Testing; Training; Training data; non-linear frequency warping; third formant
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on
Conference_Location :
Vancouver, BC
ISSN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2013.6639207
Filename :
6639207