Regional Information Center for Science and Technology (RICeST) - A robust high resolution speaker DOA estimation under reverberant environment

DocumentCode :

134305

Title :

A robust high resolution speaker DOA estimation under reverberant environment

Author :

Yifan Guo ; Zou, Y.X. ; Yongqing Wang

Author_Institution :

Sch. of Electron. & Comput. Eng., Peking Univ., Shenzhen, China

fYear :

2014

fDate :

12-14 Sept. 2014

Firstpage :

400

Lastpage :

400

Abstract :

Summary form given. Direction of arrival (DOA) estimation of a spatial speech source is a key technique in the audition system of a service robot. This paper investigates robust, high-resolution speaker DOA estimation based on an acoustic vector sensor (AVS) and the spatial sparsity representation (SSR) theory of the source. An approximate model of the inter-sensor data ratio (ISDR) of the AVS in the time-frequency (TF) domain is derived in the presence of reverberation and noise; this model determines the relationship between the AVS manifold vector and the ISDR. To make the speaker DOA estimate robust, the paper selects reliable TF points with high local signal-to-noise ratio (HLSNR) by extracting the pitch of the speech signal and fitting a curve. The SSR model of DOA estimation is then formulated, and high DOA estimation accuracy is achieved. Experimental results under different reverberation and additive noise conditions show that the proposed method achieves an RMSE below 0.5° when the SNR ranges from 5 dB to 30 dB. Moreover, the method is independent of the source frequencies and not sensitive to reverberation. Because an AVS is small and has few sensors, this approach is likely to provide a solution for speaker DOA estimation on service robots in natural home environments.

Keywords :

acoustic signal processing; curve fitting; direction-of-arrival estimation; mean square error methods; reverberation; service robots; signal representation; signal resolution; speaker recognition; time-frequency analysis; AVS manifold vector; HLSNR TF points; ISDR; RMSE; SNR; SSR theory; TF domain; acoustic vector sensor; additive noise condition; approximate model; audition system; curve fitting; direction-of-arrival estimation; high-local signal-to-noise ratio; intersensor data ratio; natural home environment; pitch extraction; reverberant environment; reverberation condition; robust high-resolution speaker DOA estimation; service robot; source frequencies; spatial sparsity representation theory; spatial speech source; time-frequency domain; Direction-of-arrival estimation; Estimation; Reverberation; Robot sensing systems; Robustness; Spatial resolution; Vectors; acoustic vector sensor; direction of arrival estimation; inter-sensor data ratio; spatial sparse representation; time-frequency sparsity;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Chinese Spoken Language Processing (ISCSLP), 2014 9th International Symposium on

Conference_Location :

Singapore

Type :

conf

DOI :

10.1109/ISCSLP.2014.6936698

Filename :

6936698

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=134305