DocumentCode
178062
Title
Generalization of temporal filter and linear transformation for robust speech recognition
Author
Duc Hoang Ha Nguyen ; Xiong Xiao ; Eng Siong Chng ; Haizhou Li
Author_Institution
Sch. of Comput. Eng., Nanyang Technol. Univ., Singapore, Singapore
fYear
2014
fDate
4-9 May 2014
Firstpage
1730
Lastpage
1734
Abstract
Temporal filtering of feature trajectories and linear transformation of feature vectors are two effective ways to compensate speech features for robust speech recognition in noisy and reverberant environments. In previous studies, the two methods are usually applied in sequence, so the interaction between them is not optimized. In this paper, we propose a generalized transform that integrates temporal filtering and linear transformation into a single process. The new transform parameters are optimized to minimize an approximated Kullback-Leibler divergence between the distribution of the compensated features and the distribution represented by a clean reference model. The proposed method is evaluated on the Aurora-5 clean-condition training task. The experiments show that the generalized transform significantly outperforms the simple cascade of temporal filtering and linear transformation. For example, in the speaker-based feature adaptation scheme, word accuracy improves from 81.55% (cascade) to 83.99% (generalized) for the office environment and from 72.09% (cascade) to 76.04% (generalized) for the living room environment.
Keywords
filtering theory; speech recognition; transforms; Aurora-5 clean condition training task; approximated Kullback-Leibler divergence; clean reference model; compensated feature distribution; feature trajectory; feature vectors; generalized transform; linear transformation; noisy environments; reverberant environments; robust speech recognition; speaker based feature adaptation scheme; temporal filter generalization; Acoustics; Robustness; Speech; Speech processing; Speech recognition; Transforms; Vectors; Kullback-Leibler divergence; Robust speech recognition; linear transformation; reverberant speech recognition; temporal filter;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
Conference_Location
Florence
Type
conf
DOI
10.1109/ICASSP.2014.6853894
Filename
6853894