Title :
Unseen noise robust speech recognition using adaptive piecewise linear transformation
Author :
Chijiiwa, Keigo ; Suzuki, Masayuki ; Minematsu, Nobuaki ; Hirose, Keikichi
Author_Institution :
Univ. of Tokyo, Tokyo, Japan
Abstract :
SPLICE is a speech enhancement method based on feature conversion that achieves high performance with a relatively small amount of computation. After noisy speech features are modeled with a Gaussian mixture model (GMM), a conversion function is obtained for each GMM component. The original SPLICE estimates clean feature vectors as a weighted sum of the converted versions of the input vectors. Because the conversion functions are determined from training data alone and then fixed, the original SPLICE is less effective in unseen noisy environments. In this paper, we propose a novel method that adapts the conversion functions so that they work well in unseen environments. First, to make the conversion functions adaptive, we characterize them by their super vectors. Then, we apply PCA to the super vectors to reduce the number of parameters to be adapted. By representing the super vectors through their PCA-based base functions and weights, we implement an efficient adaptation method for the conversion functions, which we call Eigen-SPLICE hereafter. Evaluation experiments show that Eigen-SPLICE reduces the word error rate by 21.0% relative to the conventional SPLICE, and by 24.1% relative to EMS SPLICE, on test set B of the AURORA-2 task.
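The two ideas in the abstract can be illustrated with a minimal NumPy sketch; it is not the authors' implementation. It assumes a diagonal-covariance GMM, affine conversion functions of the form A_k y + b_k, and a super vector formed by flattening all conversion parameters for one training environment. All function and variable names are hypothetical. How the PCA weights are estimated for an unseen environment is not stated in the abstract, so that step is left out.

```python
import numpy as np

def gmm_posteriors(y, weights, means, variances):
    """Posterior p(k | y) under a diagonal-covariance GMM (assumed form)."""
    diff = y - means  # shape (K, D)
    log_lik = -0.5 * np.sum(diff**2 / variances
                            + np.log(2.0 * np.pi * variances), axis=1)
    log_post = np.log(weights) + log_lik
    log_post -= np.max(log_post)  # subtract max for numerical stability
    post = np.exp(log_post)
    return post / post.sum()

def splice_convert(y, weights, means, variances, A, b):
    """Clean-feature estimate as a posterior-weighted sum of converted inputs.

    A has shape (K, D, D) and b has shape (K, D): one affine conversion
    function per GMM component (an assumption about the function form).
    """
    post = gmm_posteriors(y, weights, means, variances)  # (K,)
    converted = np.einsum('kij,j->ki', A, y) + b         # (K, D)
    return post @ converted                              # (D,)

def pca_basis(supervectors, n_components):
    """Mean and leading principal directions of per-environment super vectors."""
    mean = supervectors.mean(axis=0)
    _, _, vt = np.linalg.svd(supervectors - mean, full_matrices=False)
    return mean, vt[:n_components]

def supervector_from_weights(mean, basis, w):
    """Rebuild a super vector from a small number of PCA weights."""
    return mean + w @ basis
```

With this representation, adapting to a new environment means choosing only the few entries of `w` rather than every conversion parameter, which is the parameter reduction the abstract describes.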
Keywords :
Gaussian processes; principal component analysis; speech enhancement; speech recognition; vectors; AURORA-2 task; GMM; Gaussian mixture model; PCA; adaptive piecewise linear transformation; base function; eigen-SPLICE; feature conversion function; noisy speech feature modeling; speech enhancement method; unseen noise robust speech recognition; vector; weighted summation; Environmental management; Indexes; Noise measurement; Speech; Training data; Noise robust; Piecewise linear techniques; SPLICE
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on
Conference_Location :
Kyoto
Print_ISBN :
978-1-4673-0045-2
Electronic_ISBN :
1520-6149
DOI :
10.1109/ICASSP.2012.6288867