• DocumentCode
    527796
  • Title

    Robust character based tagging with domain lexical features for Chinese spoken language understanding

  • Author

    Bao, Changchun ; Li, Yali ; Li, Ta ; Pan, Jielin ; Yan, Yonghong

  • Author_Institution
    ThinkIT Lab., Chinese Acad. of Sci., Beijing, China
  • Volume
    7
  • fYear
    2010
  • fDate
    10-12 Aug. 2010
  • Firstpage
    3410
  • Lastpage
    3414
  • Abstract
    Word information is useful in natural language understanding (NLU). In Chinese language processing, however, word information is not given naturally. While word segmentation works well for text in NLU, it degrades Chinese spoken language understanding (SLU) because of the flexibility and distortion of spoken utterances, compounded by ASR errors. This paper proposes a novel approach, sub-word features, that makes use of word information to help understand spoken utterances while retaining the robustness of character-wise processing. With this approach, named entity lists can also be used effectively to improve SLU performance. Experiments show that the sub-word features give an average improvement of 0.7 for ASR, and that the use of named entity lists gives an average improvement of 4.7.
  • Keywords
    natural language processing; text analysis; word processing; ASR error; Chinese spoken language; character wise processing; domain lexical feature; robust character based tagging; spoken utterance understanding; sub word feature; word information; word segmentation; Noise; Robustness; Semantics; Speech; Speech recognition; Tagging; Training data;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Natural Computation (ICNC), 2010 Sixth International Conference on
  • Conference_Location
    Yantai, Shandong
  • Print_ISBN
    978-1-4244-5958-2
  • Type

    conf

  • DOI
    10.1109/ICNC.2010.5584353
  • Filename
    5584353