Title :
Robust understanding of spoken Chinese through character-based tagging and prior knowledge exploitation
Author :
Xu, Weiqun ; Bao, Changchun ; Li, Yali ; Pan, Jielin ; Yan, Yonghong
Author_Institution :
Key Lab. of Speech Acoust. & Content Understanding, Inst. of Acoust., Beijing, China
Abstract :
Robustness is one of the most challenging issues for spoken language understanding (SLU). In this paper we studied the semantic understanding of Chinese spoken language for a voice search dialogue system. We first simplified the problem of semantic understanding into a named entity recognition (NER) task, which was further formulated as sequential tagging. Based on experiments, we chose the character rather than the word as the tagging unit. Then two approaches were proposed to incorporate prior knowledge - in the form of a domain lexicon - into the character-based tagging framework. One enriched the tagger features by incorporating more formal lexical features derived from a domain lexicon. The other made plain use of domain entities by simply adding them to the training data. Experimental results show that both approaches are effective. The best performance is achieved by combining these two complementary approaches. By exploiting prior knowledge we improved the NER performance from 75.27 to 90.24 in F1 score on a field test set using speech recognizer output.
Keywords :
natural language processing; speech processing; Chinese spoken language; character-based tagging framework; domain lexicon; formal lexical features; named entity recognition; prior knowledge exploitation; robust understanding; semantic understanding; sequential tagging; spoken Chinese; spoken language understanding; voice search dialogue system; Hidden Markov models; Robustness; Semantics; Speech; Speech recognition; Tagging; Training data
Conference_Title :
Automatic Speech Recognition and Understanding (ASRU), 2011 IEEE Workshop on
Conference_Location :
Waikoloa, HI
Print_ISBN :
978-1-4673-0365-1
Electronic_ISBN :
978-1-4673-0366-8
DOI :
10.1109/ASRU.2011.6163967