DocumentCode :
1690183
Title :
Improved mixed language speech recognition using asymmetric acoustic model and language model with code-switch inversion constraints
Author :
Li, Ying ; Fung, Pascale
Author_Institution :
Dept. of Electron. & Comput. Eng., Hong Kong Univ. of Sci. & Technol., Hong Kong, China
fYear :
2013
Firstpage :
7368
Lastpage :
7372
Abstract :
We propose an integrated framework for large vocabulary continuous mixed language speech recognition that handles the accent effect in the bilingual acoustic model and the inversion constraint well known to linguists in the language model. Our asymmetric acoustic model with phone set extension improves upon previous work by striking a balance between data and phonetic knowledge. Our language model improves upon previous work by (1) using the inversion constraint to predict code switching points in the mixed language and (2) integrating a code-switch prediction model, a translation model and a reconstruction model together. This integration means that our language model avoids the pitfall of propagated error that could arise from decoupling these steps. Finally, a WFST-based decoder integrates the acoustic models, code-switch language model and a monolingual language model in the matrix language all together. Our system reduces word error rate by 1.88% on a lecture speech corpus and by 2.43% on a lunch conversation corpus, with statistical significance, over the conventional bilingual acoustic model and interpolated language model.
Keywords :
natural language processing; speech coding; speech recognition; WFST-based decoder; accent effect; asymmetric acoustic model; bilingual acoustic model; code switching points; code-switch inversion constraints; code-switch language model; code-switch prediction model; interpolated language model; large vocabulary continuous mixed language speech recognition; lecture speech corpus; lunch conversation corpus; matrix language; monolingual language model; phone set extension; phonetic knowledge; reconstruction model; translation model; word error rate; Acoustics; Adaptation models; Data models; Hidden Markov models; Predictive models; Speech; Speech recognition; mixed language; multilingual speech recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on
Conference_Location :
Vancouver, BC
ISSN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2013.6639094
Filename :
6639094