DocumentCode :
3423312
Title :
Mandarin-English bilingual Speech Recognition for real world music retrieval
Author :
Zhang, Qingqing ; Pan, Jielin ; Yan, Yonghong
Author_Institution :
Inst. of Acoust., Chinese Acad. of Sci., Beijing
fYear :
2008
fDate :
March 31 2008-April 4 2008
Firstpage :
4253
Lastpage :
4256
Abstract :
This paper presents our recent work on the development of a grammar-constrained Mandarin-English bilingual speech recognition system (MESRS) for real world music retrieval. To balance the performance and complexity of the bilingual speech recognition system, a single unified set of bilingual acoustic models derived by phone clustering is developed. A novel two-pass phone clustering method based on the confusion matrix (TCM) is presented and compared with the log-likelihood measure method. To deal with the Mandarin accent in spoken English, different non-native adaptation approaches are investigated. With the phone clustering and non-native adaptation approaches combined, the phrase error rate (PhrER) of MESRS on English utterances was reduced by 24.5% relative to the baseline monolingual English system, while the PhrER on Mandarin utterances was comparable to that of the baseline monolingual Mandarin system; on bilingual code-mixing utterances, MESRS achieved a 22.4% relative PhrER reduction.
Keywords :
information retrieval; matrix algebra; natural languages; speech recognition; Mandarin-English bilingual speech recognition system; baseline monolingual English system; baseline monolingual Mandarin system; bilingual acoustic models; bilingual code-mixing utterances; confusion matrix; log-likelihood measure method; nonnative adaptation approach; phrase error rate; real world music retrieval; two-pass phone clustering method; Clustering methods; Degradation; Error analysis; Loudspeakers; Music information retrieval; Natural languages; Speech analysis; Speech recognition; Strontium; Training data; Bilingual speech recognition; clustering methods; information retrieval;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on
Conference_Location :
Las Vegas, NV
ISSN :
1520-6149
Print_ISBN :
978-1-4244-1483-3
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2008.4518594
Filename :
4518594