DocumentCode :
865733
Title :
Discriminative Training for Large-Vocabulary Speech Recognition Using Minimum Classification Error
Author :
McDermott, Erik ; Hazen, Timothy J. ; Le Roux, Jonathan ; Nakamura, Atsushi ; Katagiri, Shigeru
Author_Institution :
NTT Commun. Sci. Labs., Kyoto
Volume :
15
Issue :
1
fYear :
2007
fDate :
2007
Firstpage :
203
Lastpage :
223
Abstract :
The minimum classification error (MCE) framework for discriminative training is a simple and general formalism for directly optimizing recognition accuracy in pattern recognition problems. The framework applies directly to the optimization of hidden Markov models (HMMs) used for speech recognition problems. However, few if any studies have reported results for the application of MCE training to large-vocabulary, continuous-speech recognition tasks. This article reports significant gains in recognition performance and model compactness as a result of discriminative training based on MCE training applied to HMMs, in the context of three challenging large-vocabulary (up to 100k-word) speech recognition tasks: the Corpus of Spontaneous Japanese lecture speech transcription task, a telephone-based name recognition task, and the MIT Jupiter telephone-based conversational weather information task. On these tasks, starting from maximum likelihood (ML) baselines, MCE training yielded relative reductions in word error ranging from 7% to 20%. Furthermore, this article evaluates the use of different methods for optimizing the MCE criterion function, as well as the use of precomputed recognition lattices to speed up training. An overview of the MCE framework is given, with an emphasis on practical implementation issues.
Keywords :
hidden Markov models; maximum likelihood estimation; speech processing; speech recognition; MIT JUPITER telephone-based conversational weather information task; continuous-speech recognition; discriminative training; large-vocabulary speech recognition; maximum likelihood baselines; minimum classification error; pattern recognition; spontaneous Japanese lecture speech transcription task; telephone-based name recognition task; Computer science; Laboratories; Large-scale systems; Mutual information; Natural languages; Optimization methods; Performance gain
fLanguage :
English
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1558-7916
Type :
jour
DOI :
10.1109/TASL.2006.876778
Filename :
4032780