• DocumentCode
    1652174
  • Title

    Connecting concepts from vision and speech processing

  • Author

    Wachsmuth, Sven ; Sagerer, Gerhard

  • Author_Institution
    Fac. of Technol., Bielefeld Univ., Germany
  • fYear
    1999
  • fDate
    1999
  • Firstpage
    1
  • Lastpage
    19
  • Abstract
    This paper addresses the problem of how to establish referential links between interpretations of speech and visual data. In order to get rid of erroneous, vague, or incomplete conceptual descriptions, we propose a probabilistic interaction scheme. The modelling of dependencies and the calculation of inferences are realized by using Bayesian networks. This interaction scheme provides a basis for disambiguation and error recovery. We implemented an interaction component in an assembly task environment. A robot constructor can be instructed by speech and pointing gestures in order to connect primitive component parts of a wooden toy construction kit. The system is evaluated on a test data set which consists of 448 spoken utterances from 16 speakers who name objects on 10 images from different scenes. First results show the effectiveness and robustness of the probabilistic approach.
  • Keywords
    belief networks; speech processing; Bayesian networks; assembly task environment; disambiguation; error recovery; inferences; pointing gestures; probabilistic interaction scheme; referential links; speech data; test data set; visual data; Artificial intelligence; Bayesian methods; Information resources; Joining processes; Layout; Robotic assembly; Robots; Robustness; System testing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Integration of Speech and Image Understanding, 1999. Proceedings
  • Conference_Location
    Corfu
  • Print_ISBN
    0-7695-0471-X
  • Type

    conf

  • DOI
    10.1109/ISIU.1999.824829
  • Filename
    824829