Title :
Conditional Random Fields for Integrating Local Discriminative Classifiers
Author :
Morris, Jeremy ; Fosler-Lussier, Eric
Author_Institution :
Ohio State Univ., Columbus
Date :
3/1/2008
Abstract :
Conditional random fields (CRFs) are a statistical framework that has recently gained popularity in both the automatic speech recognition (ASR) and natural language processing communities because they make different assumptions when predicting sequences of labels than the more traditional hidden Markov model (HMM). In the ASR community, CRFs have been employed in a manner similar to HMMs, using the sufficient statistics of input data to compute the probability of label sequences given acoustic input. In this paper, we explore the application of CRFs to combine local posterior estimates provided by multilayer perceptrons (MLPs) corresponding to the frame-level prediction of phone classes and phonological attribute classes. We compare phonetic recognition using CRFs to an HMM system trained on the same input features and show that the monophone label CRF achieves performance superior to a monophone-based HMM and comparable to a 16-Gaussian-mixture triphone-based HMM; in both cases, the CRF obtains these results with far fewer free parameters. The CRF is also better able to combine these posterior estimators, achieving a substantial increase in performance over an HMM-based triphone system by mixing the two highly correlated sets of phone class and phonetic attribute class posteriors.
Keywords :
maximum likelihood estimation; multilayer perceptrons; random processes; speech recognition; Gaussian mixture triphone-based HMM; HMM system; MLP; automatic speech recognition; conditional random fields; frame-level prediction; integrating local discriminative classifiers; local posterior estimates; monophone-based HMM; natural language processing; phonetic recognition; Communities; Decorrelation; Feature extraction; Hidden Markov models; Probability; Statistics; random fields
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
DOI :
10.1109/TASL.2008.916057