Title :
Candidate expansion algorithm based on weighted syllable confusion matrix for Mandarin LVCSR
Author :
Chang Fengxiang ; Li Baoxiang ; Liu Gang ; Guo Jun
Author_Institution :
Sch. of Inf. & Commun. Eng., Beijing Univ. of Posts & Telecommun., Beijing, China
Abstract :
Including more potentially correct words in the candidate sets is important for improving the accuracy of Large Vocabulary Continuous Speech Recognition (LVCSR). A candidate expansion algorithm based on the Weighted Syllable Confusion Matrix (WSCM) is proposed. First, the WSCM is derived from a confusion network. Then, the recognised candidates in the confusion network are used to conjecture the most likely correct words based on the WSCM, and the conjectured words are combined with the recognised candidates to produce an expanded candidate set. Finally, a model combining mutual information with a trigram language model is used to rerank the candidates. Experiments on Mandarin film data show that, on lightly erroneous utterances, the character correction rate improves by 9.57% over the initial recognition performance.
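The first two steps of the abstract (deriving a WSCM from a confusion network, then expanding the candidates) can be illustrated with a minimal Python sketch. The abstract does not specify how the matrix is weighted or how many candidates are added, so everything here is an assumption made for illustration: the slot representation (a dict of syllable posteriors), the posterior-product weighting, the row normalisation, the `top_k` cut-off, and the function names `build_wscm` and `expand_slot`. The reranking step with mutual information and a trigram language model is omitted.

```python
from collections import defaultdict


def build_wscm(slots):
    """Estimate a weighted syllable confusion matrix.

    slots: list of confusion-network slots, each a dict {syllable: posterior}.
    Assumption: two syllables competing in the same slot are treated as
    confusable, weighted by the product of their posteriors.
    """
    counts = defaultdict(lambda: defaultdict(float))
    for slot in slots:
        for a, pa in slot.items():
            for b, pb in slot.items():
                if a != b:
                    counts[a][b] += pa * pb
    # Normalise each row so the weights for one syllable sum to 1.
    wscm = {}
    for a, row in counts.items():
        total = sum(row.values())
        wscm[a] = {b: w / total for b, w in row.items()}
    return wscm


def expand_slot(slot, wscm, top_k=2):
    """Add the top_k most confusable syllables for each recognised candidate.

    A new candidate's score is the recognised candidate's posterior times
    the confusion weight (a hypothetical scoring choice).
    """
    expanded = dict(slot)
    for syl, post in slot.items():
        ranked = sorted(wscm.get(syl, {}).items(),
                        key=lambda kv: kv[1], reverse=True)
        for other, weight in ranked[:top_k]:
            if other not in expanded:
                expanded[other] = post * weight
    return expanded
```

Calling `expand_slot({"shi": 1.0}, wscm, top_k=1)` with a matrix built from slots in which "shi" competed with "si" and "xi" would add only the more strongly confused of the two to the candidate set, which a language model could then rerank.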
Keywords :
matrix algebra; natural language processing; speech recognition; Mandarin LVCSR; WSCM; candidate expansion algorithm; confusion network; conjectured words; expanded candidate set; large vocabulary continuous speech recognition; mutual information; trigram language model; weighted syllable confusion matrix; Acoustics; Data accuracy; Probability; Training data; Vocabularies; candidate expansion; confusion matrix
Journal_Title :
Communications, China
DOI :
10.1109/CC.2013.6571293