Title :
Prosodic attribute model for spoken language identification
Author :
Ng, Raymond W M ; Leung, Cheung-Chi ; Lee, Tan ; Ma, Bin ; Li, Haizhou
Author_Institution :
Department of Electronic Engineering, The Chinese University of Hong Kong, Hong Kong, China
Abstract :
Prosodic information is believed to carry language-specific cues useful for spoken language recognition. Modeling prosodic features is a challenging problem, for which a wide variety of approaches has been investigated. In this paper, a novel prosodic attribute model (PAM) is proposed to capture prosodic features with compact models. It models the language-specific co-occurrence statistics of a comprehensive set of prosodic features. When the prosodic language identification (LID) system with PAM is evaluated on the NIST Language Recognition Evaluations (LRE) 2007 and 2009, it demonstrates 21% and 11% relative equal error rate (EER) reduction, respectively, compared to a phonotactic LID system. The contributions of prosodic features to detecting some of the target languages, including tonal languages, are even more substantial. It is also noted that most prosodic attributes in the comprehensive set make positive contributions.
Keywords :
natural language processing; speech processing; speech recognition; statistics; NIST Language Recognition Evaluation; language-specific co-occurrence statistics; language-specific information; prosodic LID system; prosodic attribute model; prosodic information; spoken language identification; spoken language recognition; tonal languages; Automatic speech recognition; Cepstral analysis; Computer science; Computer vision; Error analysis; Feature extraction; Mel frequency cepstral coefficient; NIST; Natural languages; Statistics; Prosody; language identification
Conference_Title :
2010 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Conference_Location :
Dallas, TX
Print_ISBN :
978-1-4244-4295-9
ISSN :
1520-6149
DOI :
10.1109/ICASSP.2010.5495070