Title :
Neural response based phoneme classification under noisy condition
Author :
Alam, Md Shamsul ; Jassim, Wissam A. ; Zilany, Muhammad S. A.
Author_Institution :
Dept. of Biomed. Eng., Univ. of Malaya, Kuala Lumpur, Malaysia
Abstract :
Human listeners can recognize speech in noisy environments, whereas most traditional speech recognition methods perform poorly in the presence of noise. Unlike traditional Mel-frequency cepstral coefficient (MFCC)-based methods, this study proposes a phoneme classification technique that uses the neural responses of a physiologically based computational model of the auditory periphery. Neurograms were constructed from the responses of the model auditory nerve to speech phonemes. Features of the neurograms were used to train the recognition system with a Gaussian mixture model (GMM) classifier. Performance was evaluated for different types of phonemes, such as stops, fricatives, and vowels, from the TIMIT database under both quiet and noisy conditions. Although the performance of the proposed method is comparable to that of the MFCC-based classifier in quiet conditions, the proposed neural response-based method outperforms the traditional MFCC-based method under noisy conditions, even though it uses fewer features. The proposed method could be used in speech recognition applications such as speech-to-text, especially under noisy conditions.
Keywords :
Gaussian processes; acoustic noise; cepstral analysis; hearing; mixture models; speech recognition; GMM classification technique; Gaussian mixture model; MFCC-based classifier; MFCC-based method; Mel-frequency cepstral coefficient; TIMIT database; auditory nerve; auditory periphery; human listeners; neural response; neurograms; noisy condition; phoneme classification technique; physiologically-based computational model; recognition system; speech phonemes; speech recognition methods; Accuracy; Computational modeling; Noise; Noise measurement; Robustness; Speech; Speech recognition; GMM; MFCC; auditory nerve model; neurogram; phoneme classification;
Conference_Title :
Intelligent Signal Processing and Communication Systems (ISPACS), 2014 International Symposium on
Conference_Location :
Kuching
DOI :
10.1109/ISPACS.2014.7024447