Partial change accent models for accented Mandarin speech recognition

Author

Yi, Liu; Fung, Pascale

Author_Institution

Dept. of Electr. & Electron. Eng., Univ. of Sci. & Technol., Hong Kong, China

fYear

2003

fDate

30 Nov.-3 Dec. 2003

Firstpage

111

Lastpage

116

Abstract

Regional accents in Mandarin speech result mostly from partial phone changes due to the interlanguage system of non-native speakers. We propose partial change accent models based on accent-specific units with acoustic model reconstruction for accented Mandarin speech recognition. We use phonological rules of dialectical pronunciations together with a likelihood ratio test to model actual accented variants rather than inherent phonetic confusions, recognizer errors, or other data-specific variations. To avoid the model confusion and lexical confusion that come with an increased unit inventory, we improve model resolution by reconstructing the pre-trained acoustic model with Gaussian mixtures from the accent-specific unit models, where the accent-specific units are treated as hidden models. The effectiveness of this approach is evaluated on Cantonese-accented Mandarin speech. The proposed method yields a significant 4.4% absolute word error rate (WER) reduction without sacrificing performance on the native speech recognition task. The reconstructed model can be used in a single system to handle both accented and native speech.

Keywords

Gaussian distribution; error statistics; hidden Markov models; maximum likelihood estimation; speech processing; speech recognition; Cantonese; Gaussian mixtures; WER reduction; accent-specific unit models; accented Mandarin speech recognition; acoustic model reconstruction; dialectical pronunciations; hidden models; interlanguage system; lexical confusion; likelihood ratio test; model confusion; model resolution; nonnative speakers; partial change accent models; partial phone changes; phonological rules; pre-trained acoustic model; regional accents; word error rate; Acoustic testing; Acoustical engineering; Automatic speech recognition; Dictionaries; Error analysis; Humans; Loudspeakers; Natural languages; Speech analysis; Speech recognition;

fLanguage

English

Publisher

IEEE

Conference_Titel

Automatic Speech Recognition and Understanding, 2003. ASRU '03. 2003 IEEE Workshop on

Print_ISBN

0-7803-7980-2

Type

conf

DOI

10.1109/ASRU.2003.1318413

Filename

1318413