DocumentCode
3485847
Title
Robust understanding of spoken Chinese through character-based tagging and prior knowledge exploitation
Author
Xu, Weiqun ; Bao, Changchun ; Li, Yali ; Pan, Jielin ; Yan, Yonghong
Author_Institution
Key Lab. of Speech Acoust. & Content Understanding, Inst. of Acoust., Beijing, China
fYear
2011
fDate
11-15 Dec. 2011
Firstpage
413
Lastpage
418
Abstract
Robustness is one of the most challenging issues in spoken language understanding (SLU). In this paper we studied the semantic understanding of spoken Chinese for a voice search dialogue system. We first simplified the problem of semantic understanding into a named entity recognition (NER) task, which was further formulated as sequential tagging. Based on experiments, we chose the character over the word as the tagging unit. Two approaches were then proposed to incorporate prior knowledge, in the form of a domain lexicon, into the character-based tagging framework. One enriched the tagger's features by incorporating more formal lexical features with a domain lexicon. The other made plain use of domain entities by simply adding them to the training data. Experimental results show that both approaches are effective. The best performance is achieved by combining these two complementary approaches. By exploiting prior knowledge we improved the NER performance from 75.27 to 90.24 in F1 score on a field test set using speech recognizer output.
Keywords
natural language processing; speech processing; Chinese spoken language; character-based tagging framework; domain lexicon; formal lexical features; named entity recognition; prior knowledge exploitation; robust understanding; semantic understanding; sequential tagging; spoken Chinese; spoken language understanding; voice search dialogue system; Hidden Markov models; Robustness; Semantics; Speech; Speech recognition; Tagging; Training data
fLanguage
English
Publisher
IEEE
Conference_Titel
Automatic Speech Recognition and Understanding (ASRU), 2011 IEEE Workshop on
Conference_Location
Waikoloa, HI
Print_ISBN
978-1-4673-0365-1
Electronic_ISBN
978-1-4673-0366-8
Type
conf
DOI
10.1109/ASRU.2011.6163967
Filename
6163967