DocumentCode
527796
Title
Robust character based tagging with domain lexical features for Chinese spoken language understanding
Author
Bao, Changchun ; Li, Yali ; Li, Ta ; Pan, Jielin ; Yan, Yonghong
Author_Institution
ThinkIT Lab., Chinese Acad. of Sci., Beijing, China
Volume
7
fYear
2010
fDate
10-12 Aug. 2010
Firstpage
3410
Lastpage
3414
Abstract
Word information is useful in natural language understanding (NLU). In Chinese language processing, however, word information is not naturally given. While word segmentation works well for text in NLU, it degrades Chinese spoken language understanding (SLU) because of the flexibility and distortion of spoken utterances and ASR errors. This paper proposes a novel approach, sub-word features, that makes use of word information to help understand spoken utterances while retaining the robustness of character-wise processing. With this approach, a named entity list can also be used effectively to improve SLU performance. Experiments show that the sub-word features give an average improvement of 0.7 for ASR, and that using the named entity list gives an average improvement of 4.7.
Keywords
natural language processing; text analysis; word processing; ASR error; Chinese spoken language; character wise processing; domain lexical feature; robust character based tagging; spoken utterance understanding; sub word feature; word information; word segmentation; Noise; Robustness; Semantics; Speech; Speech recognition; Tagging; Training data;
fLanguage
English
Publisher
IEEE
Conference_Titel
Natural Computation (ICNC), 2010 Sixth International Conference on
Conference_Location
Yantai, Shandong
Print_ISBN
978-1-4244-5958-2
Type
conf
DOI
10.1109/ICNC.2010.5584353
Filename
5584353