• DocumentCode
    3306343
  • Title
    Pitch extraction in Human-Robot interaction
  • Author
    Heckmann, Martin ; Joublin, Frank ; Nakadai, Kazuhiro
  • Author_Institution
    Honda Res. Institute Eur. GmbH, Offenbach/Main, Germany
  • fYear
    2010
  • fDate
    18-22 Oct. 2010
  • Firstpage
    1482
  • Lastpage
    1487
  • Abstract
    We present a system for real-time fundamental frequency, i.e. pitch, extraction on a humanoid robot. The system extracts pitch using an 8-channel microphone array mounted on the Honda humanoid robot in a realistic Human-Robot interaction scenario. The main building blocks of the system are a multi-channel signal enhancement followed by robust pitch extraction and tracking. The signal enhancement is based on 8-channel Geometric Source Separation. For the pitch extraction, the signal is first transformed with a Gammatone filter bank into the frequency domain. Next, a histogram of zero-crossing distances is calculated from all filter bank signals. During the calculation of the histogram, spurious side peaks resulting from harmonics and sub-harmonics of the true fundamental frequency are inhibited. The resulting histogram then serves as input to a grid-based Bayesian tracker, which deploys Bayesian filtering in a forward step and Bayesian smoothing in a backward step on a 100 ms time window. We demonstrate the performance of the system in a scenario where male and female speakers utter different phrases while standing at a normal interaction distance from the robot. For the evaluation, we compare the pitch tracking results obtained from a clean headset signal with those obtained from the robot's microphone signals. The results show that the tracking performance degrades only to a small extent in the realistic interaction scenario compared to the headset recordings.
  • Keywords
    Bayes methods; channel bank filters; frequency-domain analysis; human-robot interaction; humanoid robots; microphone arrays; source separation; speech enhancement; 8 channel geometric source separation; 8 channel microphone array; Bayesian filtering; Gammatone filter bank; Honda humanoid robot; filter bank signal; grid based Bayesian tracker; multichannel signal enhancement; realistic human-robot interaction; robust pitch extraction; time window;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Intelligent Robots and Systems (IROS), 2010 IEEE/RSJ International Conference on
  • Conference_Location
    Taipei
  • ISSN
    2153-0858
  • Print_ISBN
    978-1-4244-6674-0
  • Type
    conf
  • DOI
    10.1109/IROS.2010.5649882
  • Filename
    5649882