DocumentCode :
1687023
Title :
Prosodic features and formant modeling for an ivector-based language recognition system
Author :
Martinez, D. ; Lleida, Eduardo ; Ortega, Antonio ; Miguel, A.
Author_Institution :
Aragon Inst. for Eng. Res. (I3A), Univ. of Zaragoza, Zaragoza, Spain
fYear :
2013
Firstpage :
6847
Lastpage :
6851
Abstract :
The prosody of a language is encoded in syllable length, loudness and pitch. These attributes make humans perceive rhythm, stress and intonation in speech. Depending on the language, these speech properties vary, making language classification possible. On the other hand, formants are the resonance frequencies of the vocal tract, depend heavily on the position adopted by the articulatory organs, and are especially useful to disambiguate vowel sounds. In this paper prosodic and formant information are combined to build a generative language identification system based on Gaussian models fed with iVectors. The system is evaluated on the NIST LRE09 database and the inclusion of formant information gives about 50% relative improvement for the 30 s task over a prosodic system without it. The fusion with a state-of-the-art acoustic system based on shifted delta cepstral coefficients (SDC) shows the complementarity of both approaches.
Keywords :
Gaussian processes; acoustic signal processing; cepstral analysis; natural language processing; speech recognition; Gaussian models; NIST LRE09 database; SDC coefficients; articulatory organs; formant information; formant modeling; generative language identification system; iVector-based language recognition system; language classification; loudness; pitch; prosodic features; shifted delta cepstral coefficients; speech intonation perception; speech rhythm perception; speech stress perception; state-of-the-art acoustic system; syllable length; vocal tract; vowel sound disambiguation; Cepstral analysis; Feature extraction; NIST; Polynomials; Speech; Stress; Formants; Joint Factor Analysis; Language Identification; Prosody; iVectors;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on
Conference_Location :
Vancouver, BC
ISSN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2013.6638988
Filename :
6638988