DocumentCode
3124203
Title
Hierarchical clustering and robust identification for block-based autoregressive speech parameter estimation
Author
Ruofei Chen ; Cheung-Fat Chan
Author_Institution
Dept. of Electron. Eng., City Univ. of Hong Kong, Kowloon, China
fYear
2012
fDate
5-8 Dec. 2012
Firstpage
103
Lastpage
107
Abstract
Given accurate system parameters such as the state transition matrix F and the corruption mapping matrix H, clean speech autoregressive (AR) parameters can be effectively estimated from a series of noisy observations with Kalman filtering. In this paper, we address several fundamental issues to improve linear dynamical system (LDS) based AR parameter estimation. A hierarchical time series clustering scheme is devised to group speech blocks with similar trajectories and corruption types. In addition, a correlated robust identification scheme using an a posteriori signal-to-noise ratio (SNR) mask is proposed to improve identification accuracy. The effectiveness of the proposed clustering and identification scheme is evaluated in terms of spectral distortion between the Kalman estimates and the true clean speech parameters. Significant improvement is observed over the original matrix quantization (MQ) based approach. The proposed scheme is also successfully applied in a model-based speech enhancement application, and is expected to be effective for robust identification in various codebook-driven speech applications.
Keywords
Kalman filters; autoregressive processes; matrix algebra; pattern clustering; speech enhancement; time series; Kalman filtering; LDS; MQ; SNR; a posteriori signal-to-noise mask; block-based autoregressive speech parameter estimation; clean speech autoregressive parameters; codebook driven speech applications; corruption mapping matrix; hierarchical time series clustering scheme; linear dynamical system based AR parameter estimation; matrix quantization based approach; model-based speech enhancement application; robust identification; state transition matrix; Estimation; Noise measurement; Signal to noise ratio; Speech; Trajectory; Vectors; autoregressive; clustering; identification; linear dynamical system; time series;
fLanguage
English
Publisher
IEEE
Conference_Titel
Chinese Spoken Language Processing (ISCSLP), 2012 8th International Symposium on
Conference_Location
Kowloon
Print_ISBN
978-1-4673-2506-6
Electronic_ISBN
978-1-4673-2505-9
Type
conf
DOI
10.1109/ISCSLP.2012.6423482
Filename
6423482