Title :
Auditory-based robust speech recognition system for ambient assisted living in smart home
Author :
Hsien-Shun Kuo ; Po-Hsun Sung ; Sheng-Chieh Lee ; Ta-Wen Kuan ; Jhing-Fa Wang
Author_Institution :
Dept. of Electr. Eng., Nat. Cheng Kung Univ., Tainan, Taiwan
Abstract :
An auditory-based feature extraction algorithm is proposed to improve the robustness of automatic speech recognition. The speech signal is characterized by a new feature, the Basilar-membrane Frequency-band Cepstral Coefficient (BFCC). Whereas the conventional Mel-Frequency Cepstral Coefficient (MFCC) method is based on a Fourier spectrogram, the BFCC method uses an auditory spectrogram computed with a gammachirp wavelet transform, which mimics the auditory response of the human ear more accurately and improves noise immunity. A Hidden Markov Model (HMM) is used for both training and testing. Evaluation on the AURORA 2 noisy speech database shows that, compared with the MFCC method, the proposed scheme improves the speech recognition rate by 15% on average for speech samples with Signal-to-Noise Ratios (SNRs) ranging from 0 to 20 dB. The proposed method therefore has significant potential for developing robust speech recognition systems for ambient assisted living.
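The pipeline outlined in the abstract (auditory filterbank, log band energies, cepstral transform) can be illustrated with a minimal sketch. This is not the authors' BFCC implementation: the record gives no filter parameters, band layout, or frame settings, so the function names, the gammachirp constants (n, b, c), the 25 ms filter length, and the center frequencies below are all illustrative assumptions. The sketch uses the standard gammachirp form t^(n-1) exp(-2 pi b ERB(fc) t) cos(2 pi fc t + c ln t) and a plain DCT-II for the cepstral step.

```python
import math

def erb(fc):
    """Equivalent rectangular bandwidth in Hz (Glasberg and Moore approximation)."""
    return 24.7 + 0.108 * fc

def gammachirp(fc, fs, n=4, b=1.019, c=-2.0, duration=0.025):
    """Unit-energy gammachirp impulse response sampled at fs.
    n, b, c and duration are assumed placeholder values, not taken from the paper."""
    length = int(duration * fs)
    h = []
    for k in range(1, length + 1):  # start at t > 0 so that ln(t) is defined
        t = k / fs
        envelope = t ** (n - 1) * math.exp(-2 * math.pi * b * erb(fc) * t)
        h.append(envelope * math.cos(2 * math.pi * fc * t + c * math.log(t)))
    norm = math.sqrt(sum(v * v for v in h))
    return [v / norm for v in h]

def band_energy(frame, h):
    """Energy of the frame after causal FIR filtering with impulse response h."""
    energy = 0.0
    for i in range(len(frame)):
        acc = 0.0
        for j in range(min(i + 1, len(h))):
            acc += h[j] * frame[i - j]
        energy += acc * acc
    return energy

def band_energies(frame, fs, centers):
    """One energy value per gammachirp channel (one row of an auditory spectrogram)."""
    return [band_energy(frame, gammachirp(fc, fs)) for fc in centers]

def dct2(x, num_coeffs):
    """First num_coeffs unnormalized DCT-II coefficients of x."""
    n = len(x)
    return [
        sum(x[m] * math.cos(math.pi * k * (m + 0.5) / n) for m in range(n))
        for k in range(num_coeffs)
    ]

def auditory_cepstrum(frame, fs, centers, num_coeffs=4):
    """Cepstral coefficients from log gammachirp band energies (hypothetical helper)."""
    log_e = [math.log(e + 1e-12) for e in band_energies(frame, fs, centers)]
    return dct2(log_e, num_coeffs)

if __name__ == "__main__":
    fs = 8000  # assumed sampling rate
    frame = [math.sin(2 * math.pi * 1000 * k / fs) for k in range(200)]
    print(auditory_cepstrum(frame, fs, [250, 500, 1000, 2000, 3000]))
```

Calling `auditory_cepstrum` on successive frames would yield a feature sequence of the kind an HMM recognizer consumes. A practical front end would use many more channels, overlapping windowed frames, and filter constants fitted to auditory data.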
Keywords :
assisted living; hidden Markov models; speech recognition; wavelet transforms; AURORA 2 noisy speech database; BFCC; Basilar-membrane frequency-band cepstral coefficient; HMM; MFCC; Mel-frequency cepstral coefficient method; SNR; ambient assisted living; auditory human ear response; auditory spectrogram; auditory-based feature extraction algorithm; auditory-based robust speech recognition system; automatic speech recognition; gammachirp wavelet transform; hidden Markov model; robust speech recognition systems; signal-to-noise ratios; smart home; Mel frequency cepstral coefficient; Noise; Robustness; Speech; Speech recognition; Wavelet transforms; Ambient assisted living; auditory modeling; cepstral coefficients; gammachirp filterbank; speech recognition
Conference_Title :
Orange Technologies (ICOT), 2014 IEEE International Conference on
Conference_Location :
Xian
DOI :
10.1109/ICOT.2014.6956626