• DocumentCode
    591903
  • Title

    Statistical semantic interpretation modeling for spoken language understanding with enriched semantic features

  • Author

    Celikyilmaz, A. ; Hakkani-Tur, Dilek ; Tur, Gokhan

  • fYear
    2012
  • fDate
    2-5 Dec. 2012
  • Firstpage
    216
  • Lastpage
    221
  • Abstract
    In natural language human-machine statistical dialog systems, semantic interpretation is a key task typically performed following semantic parsing, and aims to extract canonical meaning representations of semantic components. In the literature, manually built rules are usually used for this task, even for implicitly mentioned non-named semantic components (like genre of a movie or price range of a restaurant). In this study, we present statistical methods for modeling interpretation, which can also benefit from semantic features extracted from large in-domain knowledge sources. We extract features from user utterances using a semantic parser and additional semantic features from textual sources (online reviews, synopses, etc.) using a novel tree clustering approach, to represent unstructured information that corresponds to implicit semantic components related to targeted slots in the user's utterances. We evaluate our models on a virtual personal assistance system and demonstrate that our interpreter is effective in that it not only improves the utterance interpretation in spoken dialog systems (reducing the interpretation error rate by 36% relative compared to a language model baseline), but also unveils hidden semantic units that are otherwise nearly impossible to extract from purely manual lexical features that are typically used in utterance interpretation.
  • Keywords
    feature extraction; interactive systems; natural language processing; pattern clustering; speech recognition; speech synthesis; statistical analysis; canonical meaning representation extraction; implicit semantic components; lexical features; natural language human-machine statistical dialog systems; nonnamed semantic components; semantic feature extraction; semantic parsing; spoken dialog systems; spoken language understanding; statistical methods; statistical semantic interpretation modeling; textual sources; tree clustering approach; unstructured information representation; user utterances; utterance interpretation improvement; virtual personal assistance system; Data mining; Databases; Dictionaries; Engines; Feature extraction; Motion pictures; Semantics; graphical models; semantic interpretation; semi-supervised clustering; spoken language understanding;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Spoken Language Technology Workshop (SLT), 2012 IEEE
  • Conference_Location
    Miami, FL
  • Print_ISBN
    978-1-4673-5125-6
  • Electronic_ISBN
    978-1-4673-5124-9
  • Type

    conf

  • DOI
    10.1109/SLT.2012.6424225
  • Filename
    6424225