• DocumentCode
    3591791
  • Title
    Using vision, acoustics, and natural language for disambiguation
  • Author
    Fransen, Benjamin ; Morariu, Vlad ; Martinson, Eric ; Blisard, Samuel ; Marge, Matthew ; Thomas, Scott ; Schultz, Alan ; Perzanowski, Dennis
  • Author_Institution
    Naval Res. Lab., Washington, DC, USA
  • fYear
    2007
  • Firstpage
    73
  • Lastpage
    80
  • Abstract
    Creating a human-robot interface is a daunting task. The capabilities and functionality of the interface depend on the robustness of many different sensor and input modalities. For example, object recognition poses problems for state-of-the-art vision systems. Speech recognition in noisy environments remains problematic for acoustic systems. Natural language understanding and dialog are often limited to specific domains and baffled by ambiguous or novel utterances. Plans based on domain-specific tasks limit the applicability of dialog managers. The types of sensors used limit spatial knowledge and understanding, and constrain cognitive issues, such as perspective-taking. In this research, we are integrating several modalities, such as vision, audition, and natural language understanding, to leverage the existing strengths of each modality and overcome individual weaknesses. We are using visual, acoustic, and linguistic inputs in various combinations to solve such problems as the disambiguation of referents (objects in the environment), the localization of human speakers, and the determination of the source of utterances and the appropriateness of responses when humans and robots interact. For this research, we limit our consideration to the interaction of two humans and one robot in a retrieval scenario. This paper describes the system and the integration of the various modules prior to future testing.
  • Keywords
    human-robot interaction; natural language processing; robot vision; speech recognition; acoustic systems; cognitive issues; dialog managers; domain-specific tasks; human-robot interface; input modalities; natural language understanding; noisy environments; object recognition; perspective-taking; retrieval scenario; sensor modalities; state-of-the-art vision systems; utterances; Abstracts; Acoustics; Microphones; Resource management; Robots; Tracking; Training; Artificial Intelligence; Auditory Perspective-Taking; Dialog; Human-Robot Interaction; Natural Language Understanding; Spatial Reasoning; Vision
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    2007 2nd ACM/IEEE International Conference on Human-Robot Interaction (HRI)
  • ISSN
    2167-2121
  • Print_ISBN
    978-1-59593-617-2
  • Type
    conf
  • Filename
    6251719