DocumentCode :
734995
Title :
Multi-pronounciation dictionary construction for Mandarin-English bilingual phrase speech recognition system
Author :
Wang, C. ; Shi, W. ; Zou, Y.X.
Author_Institution :
Sch. of ECE, Peking Univ., Shenzhen, China
fYear :
2015
fDate :
12-15 July 2015
Firstpage :
15
Lastpage :
19
Abstract :
In multilingual communities, non-native speakers may produce speech sounds that either belong to their native language or arise from merging characteristics of native and non-native pronunciation. Recently, a Two-pass phone clustering method based on Confusion Matrix (TCM) was proposed to address one-to-one phone mappings between Chinese syllables and English phones using standard Chinese and English data. In this paper, we extend TCM to the one-to-many phone mapping problem, since native and non-native pronunciations merge in bilingual speech. By supplying TCM with a knowledge-based phone set as a supplement for phone clustering, we propose a novel method termed TCM with Initialization and Updating of the Phone Set (TCM-IUPS). The pronunciation dictionary is then built from the information learned by TCM-IUPS together with the canonical pronunciations. Experiments show that, compared with TCM, TCM-IUPS reduces the Phrase Error Rate (PhrER) by 5.27% on the bilingual testing corpora and by 26.09% on the mono-English testing corpora, while maintaining the same performance on the mono-Mandarin testing corpora.
Keywords :
error statistics; matrix algebra; natural language processing; pattern clustering; speech recognition; Chinese data; Chinese syllables; English data; English phones; Mandarin-English bilingual phrase speech recognition system; PhrER; TCM approach; TCM-IUPS; bilingual speeches; bilingual testing corpora; canonical pronunciation; confusion matrix; initialization and updating of the phone set method; knowledge-based phone set; mono-English testing corpora; mono-Mandarin testing corpora; multilingual communities; multipronounciation dictionary construction; native language; nonnative pronunciation; nonnative speakers; one-to-many phone mappings; one-to-one phone mappings; phrase error rate; speech sound; two-pass phone clustering; Decision support systems; Dictionaries; Feature extraction; Handheld computers; Indexes; Speech; Testing; Accent issue; bilingual speech recognition; initialization and updating of the phone set; multi-pronunciation dictionary;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Signal and Information Processing (ChinaSIP), 2015 IEEE China Summit and International Conference on
Conference_Location :
Chengdu
Type :
conf
DOI :
10.1109/ChinaSIP.2015.7230353
Filename :
7230353