• DocumentCode
    1874231
  • Title
    Two-channel-based voice activity detection for humanoid robots in noisy home environments
  • Author
    Kim, Hyun-Don ; Komatani, Kazunori ; Ogata, Tetsuya ; Okuno, Hiroshi G.
  • Author_Institution
    Speech Media Process. Group, Kyoto Univ., Kyoto
  • fYear
    2008
  • fDate
    19-23 May 2008
  • Firstpage
    3495
  • Lastpage
    3501
  • Abstract
    The purpose of this research is to accurately classify speech signals originating from the front of a robot, even in noisy home environments. This ability can help robots improve speech recognition and spot keywords. We therefore developed a new voice activity detection (VAD) method based on the complex spectrum circle centroid (CSCC) method. It classifies speech signals arriving from the front of two microphones by comparing the spectral energy of the observed signals with that of the target signals estimated by CSCC. It also works in real time, without training filter coefficients beforehand, even in noisy environments (SNR > 0 dB), and it can cope with speech noise generated by audio-visual equipment such as televisions and audio devices. Since the CSCC method requires the directions of the noise signals, we also developed a sound source localization system that integrates cross-power spectrum phase (CSP) analysis with an expectation-maximization (EM) algorithm. This system was demonstrated to enable a robot to cope with multiple sound sources using two microphones.
  • Keywords
    expectation-maximisation algorithm; humanoid robots; noise (working environment); speech recognition; complex spectrum circle centroid method; cross-power spectrum phase; expectation-maximization algorithm; humanoid robots; microphones; noisy home environments; sound source localization system; speech noises; speech recognition; speech signal classification; two-channel-based voice activity detection; Acoustic noise; Filters; Humanoid robots; Microphones; Noise generators; Phase noise; Signal to noise ratio; Speech recognition; TV; Working environment noise;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Robotics and Automation, 2008. ICRA 2008. IEEE International Conference on
  • Conference_Location
    Pasadena, CA
  • ISSN
    1050-4729
  • Print_ISBN
    978-1-4244-1646-2
  • Electronic_ISBN
    1050-4729
  • Type
    conf
  • DOI
    10.1109/ROBOT.2008.4543745
  • Filename
    4543745
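
The abstract mentions cross-power spectrum phase (CSP) analysis as part of the two-microphone sound source localization. The following is a minimal sketch of generic CSP delay estimation (also known as GCC-PHAT), not the authors' implementation: it omits the EM step and the CSCC-based VAD entirely. The function names `csp_delay` and `delay_to_angle`, the far-field assumption, and the default speed of sound of 343 m/s are illustrative assumptions that do not come from the record.

```python
import numpy as np


def csp_delay(x1, x2, max_lag=None):
    """Estimate the delay of x2 relative to x1, in samples, via CSP (GCC-PHAT).

    A positive result means x2 lags x1. Illustrative sketch only.
    """
    n = len(x1) + len(x2)              # zero-pad to avoid circular wrap-around
    X1 = np.fft.rfft(x1, n=n)
    X2 = np.fft.rfft(x2, n=n)
    cross = X2 * np.conj(X1)           # cross-power spectrum
    cross /= np.abs(cross) + 1e-12     # keep the phase only (PHAT weighting)
    csp = np.fft.irfft(cross, n=n)     # CSP coefficients over lag
    if max_lag is None:
        max_lag = n // 2
    # Reorder so the array covers lags -max_lag .. +max_lag.
    csp = np.concatenate((csp[-max_lag:], csp[:max_lag + 1]))
    return int(np.argmax(csp)) - max_lag


def delay_to_angle(delay_samples, fs, mic_spacing, c=343.0):
    """Convert a delay to an arrival angle in degrees (far-field assumption).

    fs is the sampling rate in Hz, mic_spacing is in metres, and c is an
    assumed speed of sound in m/s. 0 degrees corresponds to the front.
    """
    s = np.clip(c * delay_samples / (fs * mic_spacing), -1.0, 1.0)
    return float(np.degrees(np.arcsin(s)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    s = rng.standard_normal(1024)
    d = 5
    x1 = s
    x2 = np.concatenate((np.zeros(d), s[:-d]))   # x1 delayed by d samples
    lag = csp_delay(x1, x2, max_lag=20)
    print("estimated delay (samples):", lag)
    print("angle (deg):", delay_to_angle(lag, fs=16000, mic_spacing=0.2))
```

A source directly in front of the microphone pair yields a delay near zero, which is why a delay estimate of this kind can be used to separate frontal speech from sources at other angles.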