Title :
Non-negative subspace projection during conventional MFCC feature extraction for noise robust speech recognition
Author :
Pavan Kumar, D.S. ; Bilgi, Raghavendra R. ; Umesh, S.
Author_Institution :
Department of Electrical Engineering, Indian Institute of Technology Madras, Chennai 600036, India
Abstract :
An additional feature processing algorithm based on Non-negative Matrix Factorization (NMF) is proposed for inclusion in the conventional extraction of Mel-frequency cepstral coefficients (MFCC) to achieve noise robustness in HMM-based speech recognition. The proposed approach reconstructs the log-Mel filterbank outputs of speech data from a set of building blocks that form the bases of a speech subspace. The bases are learned by applying standard NMF to the training data. A variation on learning the bases is also proposed, which uses histogram-equalized activation coefficients during training to achieve noise robustness. The proposed methods give up to 5.96% absolute improvement in recognition accuracy on the Aurora-2 task over a baseline with standard MFCCs, and up to 13.69% improvement when combined with other feature normalization techniques such as Histogram Equalization (HEQ) and Heteroscedastic Linear Discriminant Analysis (HLDA).
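The subspace projection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the filterbank features have been made non-negative, uses the widely known multiplicative-update rules for the Euclidean NMF cost, and runs on synthetic data. The function names `learn_bases` and `project`, the rank, and the iteration counts are hypothetical choices for illustration. The histogram-equalized variant of training is not covered.

```python
import numpy as np


def learn_bases(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Factor a non-negative matrix V (n_bands x n_frames) as W @ H.

    Uses multiplicative updates for the Euclidean cost; this is an assumed
    stand-in for the "standard NMF" named in the abstract.
    """
    rng = np.random.default_rng(seed)
    n_bands, n_frames = V.shape
    W = rng.random((n_bands, rank)) + eps   # bases (columns span the subspace)
    H = rng.random((rank, n_frames)) + eps  # activation coefficients
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def project(V, W, n_iter=200, eps=1e-9, seed=0):
    """Reconstruct V from fixed bases W by updating only the activations."""
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
    return W @ H  # non-negative reconstruction lying in the learned subspace


# Synthetic stand-in for non-negative filterbank features (assumption:
# 23 bands, 60 training frames, 10 test frames, true rank 4).
rng = np.random.default_rng(1)
true_bases = rng.random((23, 4))
train = true_bases @ rng.random((4, 60))
test = true_bases @ rng.random((4, 10))

W, H = learn_bases(train, rank=4)
test_reconstructed = project(test, W)
print(test_reconstructed.shape)
```

In a full front end, the reconstructed filterbank outputs would presumably replace the original ones before the DCT step that produces the cepstral coefficients; that placement is inferred from the title and abstract rather than stated in detail here.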
Keywords :
Hidden Markov models; Manganese; Matrix decomposition; Robustness; Signal to noise ratio; Speech; Mel-frequency cepstral coefficients; Speech recognition; noise robustness; nonnegative matrix factorization;
Conference_Title :
Communications (NCC), 2013 National Conference on
Conference_Location :
New Delhi, India
Print_ISBN :
978-1-4673-5950-4
Electronic_ISBN :
978-1-4673-5951-1
DOI :
10.1109/NCC.2013.6487993