• DocumentCode
    3485847
  • Title

    Robust understanding of spoken Chinese through character-based tagging and prior knowledge exploitation

  • Author

    Xu, Weiqun ; Bao, Changchun ; Li, Yali ; Pan, Jielin ; Yan, Yonghong

  • Author_Institution
    Key Lab. of Speech Acoust. & Content Understanding, Inst. of Acoust., Beijing, China
  • fYear
    2011
  • fDate
    11-15 Dec. 2011
  • Firstpage
    413
  • Lastpage
    418
  • Abstract
    Robustness is one of the most challenging issues for spoken language understanding (SLU). In this paper we studied the semantic understanding of Chinese spoken language for a voice search dialogue system. We first simplified the problem of semantic understanding into a named entity recognition (NER) task, which was further formulated as sequential tagging. We carried out experiments to opt for character over word as the tagging unit. Then two approaches were proposed to incorporate prior knowledge - in the form of a domain lexicon - into the character-based tagging framework. One enriched tagger features by incorporating more formal lexical features with a domain lexicon. The other made plain use of domain entities by simply adding them to the training data. Experimental results show that both approaches are effective. The best performance is achieved by combining the above two complementary approaches. By exploiting prior knowledge we improved the NER performance from 75.27 to 90.24 in F1 score on a field test set using speech recognizer output.
  • Keywords
    natural language processing; speech processing; Chinese spoken language; character-based tagging framework; domain lexicon; formal lexical features; named entity recognition; prior knowledge exploitation; robust understanding; semantic understanding; sequential tagging; spoken Chinese; spoken language understanding; voice search dialogue system; Hidden Markov models; Robustness; Semantics; Speech; Speech recognition; Tagging; Training data
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Automatic Speech Recognition and Understanding (ASRU), 2011 IEEE Workshop on
  • Conference_Location
    Waikoloa, HI
  • Print_ISBN
    978-1-4673-0365-1
  • Electronic_ISBN
    978-1-4673-0366-8
  • Type

    conf

  • DOI
    10.1109/ASRU.2011.6163967
  • Filename
    6163967