• DocumentCode
    3339862
  • Title

    MiPad: a multimodal interaction prototype

  • Author

    Huang, X. ; Acero, A. ; Chelba, C. ; Deng, L. ; Droppo, J. ; Duchene, D. ; Goodman, J. ; Hon, H. ; Jacoby, D. ; Jiang, L. ; Loynd, R. ; Mahajan, M. ; Mau, P. ; Meredith, S. ; Mughal, S. ; Neto, S. ; Plumpe, M. ; Steury, K. ; Venolia, G. ; Wang, K. ; Wang,

  • Author_Institution
    Speech Technol. Group, Microsoft Res., Redmond, WA, USA
  • Volume
    1
  • fYear
    2001
  • fDate
    2001
  • Firstpage
    9
  • Abstract
    Dr. Who is a Microsoft Research project that aims to create a speech-centric multimodal interaction framework, which serves as the foundation for the .NET natural user interface. MiPad is the application prototype that demonstrates compelling user advantages for wireless personal digital assistant (PDA) devices. MiPad fully integrates continuous speech recognition (CSR) and spoken language understanding (SLU) to enable users to accomplish many common tasks using a multimodal interface and wireless technologies. It tries to solve the problem of pecking with tiny styluses or typing on minuscule keyboards in today's PDAs. Unlike a cellular phone, MiPad avoids speech-only interaction. It incorporates a built-in microphone that activates whenever a field is selected. As a user taps the screen or uses a built-in roller to navigate, the tapping action narrows the number of possible instructions for spoken word understanding. MiPad currently runs on a Windows CE Pocket PC, with speech recognition performed on a Windows 2000 machine. The Dr. Who CSR engine uses a unified CFG and n-gram language model. The Dr. Who SLU engine is based on a robust chart parser and a plan-based dialog manager. The paper discusses MiPad's design, implementation work in progress, and a preliminary user study in comparison to the existing pen-based PDA interface.
  • Keywords
    client-server systems; natural language interfaces; notebook computers; speech recognition; speech-based user interfaces; Dr. Who; MiPad; Microsoft research project; .NET natural user interface; Windows 2000 machine; Windows CE Pocket PC; continuous speech recognition; multimodal interaction prototype; n-gram language model; plan-based dialog manager; robust chart parser; speech-centric multimodal interaction framework; spoken language understanding; tapping action; wireless personal digital assistant devices; wireless technologies; Cellular phones; Engines; Keyboards; Microphones; Natural languages; Navigation; Personal digital assistants; Prototypes; Speech recognition; User interfaces
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 2001. Proceedings. (ICASSP '01). 2001 IEEE International Conference on
  • Conference_Location
    Salt Lake City, UT
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-7041-4
  • Type

    conf

  • DOI
    10.1109/ICASSP.2001.940754
  • Filename
    940754