• DocumentCode
    3499738
  • Title

    A speech-centric perspective for human-computer interface

  • Author

    Deng, L. ; Acero, A. ; Wang, Y. ; Wang, K. ; Hon, H. ; Droppo, J. ; Mahajan, M. ; Huang, X.D.

  • Author_Institution
    Microsoft Res., Redmond, WA, USA
  • fYear
    2002
  • fDate
    9-11 Dec. 2002
  • Firstpage
    263
  • Lastpage
    267
  • Abstract
    Speech technology has been playing a central role in enhancing human-machine interactions, especially for small devices for which GUI has obvious limitations. The speech-centric perspective for human-computer interface advanced in this paper derives from the view that speech is the only natural and expressive modality to enable people to access information from and to interact with any device. In this paper, we describe the work conducted at Microsoft Research, in the project codenamed Dr.Who, aimed at the development of enabling technologies for speech-centric multimodal human-computer interaction. In particular, we present MiPad as the first Dr.Who's application that addresses specifically the mobile user interaction scenario. MiPad is a wireless mobile PDA prototype that enables users to accomplish many common tasks using a multimodal spoken language interface and wireless-data technologies. It fully integrates continuous speech recognition and spoken language understanding, and provides a novel solution to the current prevailing problem of pecking with tiny styluses or typing on minuscule keyboards in today's PDAs or smart phones.
  • Keywords
    linguistics; mobile computing; notebook computers; speech processing; speech recognition; speech-based user interfaces; Dr.Who project; GUI; MiPad; Microsoft research; acoustic modeling; back channel communication; graphic user interface; human-computer interface; knowledge representation; language modeling; man-machine interaction; mobile user interaction scenario; multimodal spoken language interface; personal digital assistant; robust speech recognition; smart phones; speech technology; speech-centric perspective; spoken language understanding; tap & talk multimodal interaction; wireless mobile PDA prototype; Graphical user interfaces; Keyboards; Man machine systems; Microphones; Natural languages; Personal digital assistants; Prototypes; Speech enhancement; Speech processing; Speech recognition;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Multimedia Signal Processing, 2002 IEEE Workshop on
  • Print_ISBN
    0-7803-7713-3
  • Type

    conf

  • DOI
    10.1109/MMSP.2002.1203296
  • Filename
    1203296