  • DocumentCode
    2329422
  • Title
    What is left to be understood in ATIS?
  • Author
    Tur, Gokhan ; Hakkani-Tur, Dilek ; Heck, Larry
  • Author_Institution
    Microsoft Res., Mountain View, CA, USA
  • fYear
    2010
  • fDate
    12-15 Dec. 2010
  • Firstpage
    19
  • Lastpage
    24
  • Abstract
    One of the main data resources used in many studies over the past two decades for spoken language understanding (SLU) research in spoken dialog systems is the airline travel information system (ATIS) corpus. Two primary tasks in SLU are intent determination (ID) and slot filling (SF). Recent studies employing discriminative machine learning techniques reported error rates below 5% for both of these tasks on the ATIS test set. While these low error rates may suggest that this task is close to being solved, further analysis reveals the continued utility of ATIS as a research corpus. In this paper, our goal is not to experiment with domain-specific techniques or features that can help with the remaining SLU errors, but instead to explore methods to realize this utility via extensive error analysis. We conclude that even with such low error rates, the ATIS test set still includes many unseen example categories and sequences, and hence requires more data. Better yet, new, larger annotated data sets from more complex tasks with realistic utterances can avoid over-tuning in terms of modeling and feature design. We believe that advancements in SLU can be achieved by having more naturally spoken data sets and employing more linguistically motivated features while preserving robustness to speech recognition noise and to variance due to natural language.
  • Keywords
    computational linguistics; error analysis; natural language processing; speech recognition; airline travel information system; data resources; discriminative machine learning; error rates; intent determination; linguistically motivated features; natural language; naturally spoken data sets; slot filling; speech recognition noise; spoken dialog systems; spoken language understanding; ATIS; discriminative training
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Spoken Language Technology Workshop (SLT), 2010 IEEE
  • Conference_Location
    Berkeley, CA
  • Print_ISBN
    978-1-4244-7904-7
  • Electronic_ISBN
    978-1-4244-7902-3
  • Type
    conf
  • DOI
    10.1109/SLT.2010.5700816
  • Filename
    5700816