• DocumentCode
    134305
  • Title
    A robust high resolution speaker DOA estimation under reverberant environment
  • Author
    Yifan Guo; Zou, Y.X.; Yongqing Wang
  • Author_Institution
    Sch. of Electron. & Comput. Eng., Peking Univ., Shenzhen, China
  • fYear
    2014
  • fDate
    12-14 Sept. 2014
  • Firstpage
    400
  • Lastpage
    400
  • Abstract
    Summary form given. Direction-of-arrival (DOA) estimation of a spatial speech source is a key technique in the audition system of a service robot. This paper investigates robust, high-resolution speaker DOA estimation based on an acoustic vector sensor (AVS) and the spatial sparsity representation (SSR) theory of the source. An approximate model of the inter-sensor data ratio (ISDR) of the AVS in the time-frequency (TF) domain is derived in the presence of reverberation and noise; this model determines the relationship between the AVS manifold vector and the ISDR. To make the speaker DOA estimate robust, reliable TF points with a high local signal-to-noise ratio (HLSNR) are selected by extracting the pitch of the speech signal and fitting a curve. The SSR model of DOA estimation is then formulated, yielding high DOA estimation accuracy. Experimental results under different reverberation and additive noise conditions show that the proposed method achieves a root-mean-square error (RMSE) below 0.5° for SNRs from 5 dB to 30 dB. Moreover, the method is independent of the source frequencies and insensitive to reverberation. Because an AVS is small and uses few sensors, this approach may offer a solution for speaker DOA estimation by service robots in natural home environments.
  • Keywords
    acoustic signal processing; curve fitting; direction-of-arrival estimation; mean square error methods; reverberation; service robots; signal representation; signal resolution; speaker recognition; time-frequency analysis; AVS manifold vector; HLSNR TF points; ISDR; RMSE; SNR; SSR theory; TF domain; acoustic vector sensor; additive noise condition; approximate model; audition system; high-local signal-to-noise ratio; intersensor data ratio; natural home environment; pitch extraction; reverberant environment; reverberation condition; robust high-resolution speaker DOA estimation; service robot; source frequencies; spatial sparsity representation theory; spatial speech source; time-frequency domain; Direction-of-arrival estimation; Estimation; Reverberation; Robot sensing systems; Robustness; Spatial resolution; Vectors; direction of arrival estimation; inter-sensor data ratio; spatial sparse representation; time-frequency sparsity
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Chinese Spoken Language Processing (ISCSLP), 2014 9th International Symposium on
  • Conference_Location
    Singapore
  • Type
    conf
  • DOI
    10.1109/ISCSLP.2014.6936698
  • Filename
    6936698