Title :
Partial change accent models for accented Mandarin speech recognition
Author :
Yi, Liu ; Fung, Pascale
Author_Institution :
Dept. of Electr. & Electron. Eng., Univ. of Sci. & Technol., Hong Kong, China
fDate :
30 Nov.-3 Dec. 2003
Abstract :
Regional accents in Mandarin speech result mostly from partial phone changes due to the interlanguage system of non-native speakers. We propose partial change accent models based on accent-specific units with acoustic model reconstruction for accented Mandarin speech recognition. We use phonological rules of dialectical pronunciations together with a likelihood ratio test to model actual accented variants rather than inherent phonetic confusions, recognizer errors, or other data-specific variations. To avoid the model confusion and lexical confusion that come with an increased unit inventory, we improve model resolution by reconstructing the pre-trained acoustic model using the Gaussian mixtures from accent-specific unit models, where the accent-specific units are treated as hidden models. The effectiveness of this approach is evaluated on Cantonese-accented Mandarin speech. Our proposed method yields a significant 4.4% absolute word error rate (WER) reduction without sacrificing performance on the native speech recognition task. The reconstructed model can be used in a single system to handle both accented and native speech.
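The abstract names two mechanisms without giving their details: a likelihood ratio test that separates genuine accent variants from incidental confusions, and a reconstruction step that brings Gaussian mixtures from accent-specific unit models into the pre-trained model. The Python sketch below illustrates one plausible reading of each and should not be taken as the authors' implementation. It assumes the test compares how often a phone substitution occurs in accented versus native data using a standard binomial log-likelihood ratio statistic, and that reconstruction can be approximated by concatenating mixture components with rescaled weights. The function names, the `accent_share` parameter, and the counts are all hypothetical.

```python
import math


def binomial_log_likelihood(k, n, p):
    """Log-likelihood of k events in n trials at rate p (constant term omitted)."""
    # Clamp p so that log(0) never occurs when a rate is exactly 0 or 1.
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)
    return k * math.log(p) + (n - k) * math.log(1.0 - p)


def likelihood_ratio_statistic(k_acc, n_acc, k_nat, n_nat):
    """Statistic comparing a substitution's rate in accented vs. native data.

    k_* is how often the substitution was observed and n_* is how often the
    base phone occurred. Under the null hypothesis of one shared rate, the
    statistic is asymptotically chi-squared with one degree of freedom.
    """
    p_acc = k_acc / n_acc
    p_nat = k_nat / n_nat
    p_pooled = (k_acc + k_nat) / (n_acc + n_nat)
    separate = (binomial_log_likelihood(k_acc, n_acc, p_acc)
                + binomial_log_likelihood(k_nat, n_nat, p_nat))
    pooled = (binomial_log_likelihood(k_acc, n_acc, p_pooled)
              + binomial_log_likelihood(k_nat, n_nat, p_pooled))
    return 2.0 * (separate - pooled)


def merge_mixtures(base, accent, accent_share):
    """Combine two Gaussian mixtures given as lists of (weight, mean, variance).

    Assumption: the accent-specific components are appended to the base
    unit's mixture, and accent_share is the total weight they receive.
    """
    scaled_base = [(w * (1.0 - accent_share), m, v) for w, m, v in base]
    scaled_accent = [(w * accent_share, m, v) for w, m, v in accent]
    return scaled_base + scaled_accent


# Hypothetical counts: the substitution occurs 40 times in 100 accented
# tokens but only 5 times in 100 native tokens.
statistic = likelihood_ratio_statistic(40, 100, 5, 100)
is_accent_variant = statistic > 3.84  # chi-squared critical value, 1 df, 5% level

base_gmm = [(0.6, 0.0, 1.0), (0.4, 2.0, 1.5)]
accent_gmm = [(1.0, 1.2, 0.8)]
reconstructed = merge_mixtures(base_gmm, accent_gmm, accent_share=0.2)
```

In this reading, a substitution that also appears at a similar rate in native speech yields a statistic near zero and is treated as an inherent confusion instead of an accent change. Because the merged mixture keeps the original unit label, the lexicon does not grow, which is one way to interpret the abstract's phrase "treated as hidden models".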
Keywords :
Gaussian distribution; error statistics; hidden Markov models; maximum likelihood estimation; speech processing; speech recognition; Cantonese; Gaussian mixtures; WER reduction; accent-specific unit models; accented Mandarin speech recognition; acoustic model reconstruction; dialectical pronunciations; hidden models; interlanguage system; lexical confusion; likelihood ratio test; model confusion; model resolution; nonnative speakers; partial change accent models; partial phone changes; phonological rules; pre-trained acoustic model; regional accents; word error rate; Acoustic testing; Acoustical engineering; Automatic speech recognition; Dictionaries; Error analysis; Humans; Loudspeakers; Natural languages; Speech analysis; Speech recognition;
Conference_Title :
Automatic Speech Recognition and Understanding, 2003. ASRU '03. 2003 IEEE Workshop on
Print_ISBN :
0-7803-7980-2
DOI :
10.1109/ASRU.2003.1318413