DocumentCode
2972303
Title
Hidden Conditional Random Fields for phone recognition
Author
Sung, Yun-hsuan ; Jurafsky, Dan
Author_Institution
Electr. Eng., Stanford Univ., Stanford, CA, USA
fYear
2009
fDate
Nov. 13 2009-Dec. 17 2009
Firstpage
107
Lastpage
112
Abstract
We apply Hidden Conditional Random Fields (HCRFs) to the task of TIMIT phone recognition. HCRFs are discriminatively trained sequence models that augment conditional random fields with hidden states that are capable of representing subphones and mixture components. We extend HCRFs, which had previously only been applied to phone classification with known boundaries, to recognize continuous phone sequences. We use an N-best inference algorithm in both learning (to approximate all competitor phone sequences) and decoding (to marginalize over hidden states). Our monophone HCRFs achieve 28.3% phone error rate, outperforming maximum likelihood trained HMMs by 3.6%, maximum mutual information trained HMMs by 2.5%, and minimum phone error trained HMMs by 2.2%. We show that this win is partially due to HCRFs' ability to simultaneously optimize discriminative language models and acoustic models, a powerful property that has important implications for speech recognition.
Keywords
speech recognition; telephone sets; N-best inference algorithm; acoustic models; discriminative language models; hidden conditional random fields; phone error rate; phone recognition; Error analysis; Hidden Markov models; Inference algorithms; Labeling; Maximum likelihood decoding; Mutual information; Natural languages; Power system modeling; Shape; Speech recognition
fLanguage
English
Publisher
IEEE
Conference_Titel
Automatic Speech Recognition & Understanding, 2009. ASRU 2009. IEEE Workshop on
Conference_Location
Merano
Print_ISBN
978-1-4244-5478-5
Electronic_ISBN
978-1-4244-5479-2
Type
conf
DOI
10.1109/ASRU.2009.5373329
Filename
5373329