• DocumentCode
    2363019
  • Title
    From science fiction to science fact: A Smart-House interface using speech technology and a photo-realistic avatar
  • Author
    Moir, T.J. ; Filho, G.L.
  • Author_Institution
    Sch. of Eng. & Adv. Technol., Massey Univ., Auckland
  • fYear
    2008
  • fDate
    2-4 Dec. 2008
  • Firstpage
    327
  • Lastpage
    333
  • Abstract
    This paper explores the problems of speech recognition in a (sometimes) noisy environment. An adaptive acoustic beamformer based on the Griffiths-Jim method is proposed. It defines a "hot-spot": speech is received within a geometrically defined boundary and rejected outside it. This is shown to give a certain amount of noise immunity and to improve the signal-to-noise ratio for the second stage, which is the speech recognition engine. The recognition engine has a limited vocabulary, which gives rise to an excellent hit-rate and requires less training than an unlimited vocabulary. A limited vocabulary is sufficient for a good many applications where devices such as lighting, TV, and radio are switched in a Boolean form. In addition to speech recognition, good-quality speech synthesis is necessary to feed back information about the house to the end-user. This technology has improved vastly within the last decade, and it is shown that a head-and-shoulders avatar that is both photo-realistic and has an appealing personality vastly enhances the experience of a speech interface. The paper explores these technologies and investigates the convergence of many of them in the current Massey smart-office.
  • Keywords
    avatars; home computing; speech recognition; speech synthesis; user interfaces; vocabulary; Boolean form; Griffiths-Jim method; Massey smart-office; adaptive acoustic beamformer; geometric defined boundary; head avatar; hot-spot; noise immunity; photo-realistic avatar; shoulders avatar; signal-to-noise ratio; smart-house interface; speech recognition engine; speech synthesis; vocabulary; Acoustic noise; Avatars; Engines; Signal to noise ratio; Speech enhancement; Speech recognition; Speech synthesis; TV; Vocabulary; Working environment noise;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Mechatronics and Machine Vision in Practice, 2008. M2VIP 2008. 15th International Conference on
  • Conference_Location
    Auckland
  • Print_ISBN
    978-1-4244-3779-5
  • Electronic_ISBN
    978-0-473-13532-4
  • Type
    conf
  • DOI
    10.1109/MMVIP.2008.4749555
  • Filename
    4749555