• DocumentCode
    2060134
  • Title

    English Access to Structured Data

  • Author

    Richardson, Kyle D. ; Bobrow, Daniel G. ; Condoravdi, Cleo ; Waldinger, Richard ; Das, Amar

  • Author_Institution
    Palo Alto Res. Center, Palo Alto, CA, USA
  • fYear
    2011
  • fDate
    18-21 Sept. 2011
  • Firstpage
    13
  • Lastpage
    20
  • Abstract
    We present work on using a domain model to guide text interpretation, in the context of a project that aims to interpret English questions as a sequence of queries to be answered from structured databases. We adapt a broad-coverage and ambiguity-enabled natural language processing (NLP) system to produce domain-specific logical forms, using knowledge of the domain to zero in on the appropriate interpretation. The vocabulary of the logical forms is drawn from a domain theory that constitutes a higher-level abstraction of the contents of a set of related databases. The meanings of the terms are encoded in an axiomatic domain theory. To retrieve information from the databases, the logical forms must be instantiated by values constructed from fields in the database. The axiomatic domain theory is interpreted by the first-order theorem prover SNARK to identify the groundings and then retrieve the values through procedural attachments semantically linked to the database. SNARK attempts to prove the logical form as a theorem by reasoning over the theory that is linked to the database and returns the exemplars of the proof(s) to the user as answers to the query. The focus of this paper is more on the language task; however, we discuss the interaction that must occur between linguistic analysis and reasoning for an end-to-end natural language interface to databases. We illustrate the process using examples drawn from an HIV treatment domain, where the underlying databases are records of temporally bound treatments of individual patients.
  • Keywords
    computational linguistics; natural language processing; query processing; question answering (information retrieval); theorem proving; vocabulary; English access; English question; NLP system; ambiguity-enabled natural language processing; axiomatic domain theory; broad-coverage natural language processing; deductive question answering; domain-specific logical form; end-to-end natural language interface; first-order theorem prover SNARK; higher-level abstraction; language task; linguistic analysis; query; reasoning; structured database; vocabulary; Bridges; Cognition; Databases; Drugs; Natural languages; Pragmatics; Semantics; Deductive question answering; HIV drug resistance database; Natural language interfaces to databases; Natural language processing; Theorem proving;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Semantic Computing (ICSC), 2011 Fifth IEEE International Conference on
  • Conference_Location
    Palo Alto, CA
  • Print_ISBN
    978-1-4577-1648-5
  • Electronic_ISBN
    978-0-7695-4492-2
  • Type

    conf

  • DOI
    10.1109/ICSC.2011.67
  • Filename
    6061430