Title :
Performance comparison of speaker and emotion recognition
Author :
Revathy, A. ; Shanmugapriya, P. ; Mohan, V.
Author_Institution :
Dept. of ECE, Saranathan Coll. of Eng., Trichy, India
Abstract :
This paper examines the effectiveness of the Hidden Markov Model Toolkit (HTK) for recognizing speech, speaker and emotion from emotional speech, using Mel frequency cepstral coefficients (MFCC) as features. Emotion-independent speech recognition, speaker-independent speech recognition, emotion-independent speaker recognition and speaker-independent emotion recognition systems are proposed and their performance is analyzed. The EMO-DB database is used in this work, with 80% of the data used for training and 20% for testing. The systems achieve average accuracies of 100%, 97%, 90% and 68% for speaker-independent speech recognition, emotion-independent speech recognition, speaker recognition and emotion recognition, respectively. Since the HTK-based system gives good results for emotional speech recognition, the speaker-independent and emotion-independent emotional speech recognition system is also evaluated on noisy test speech. Accuracy improves when an additional noise-reduction preprocessing step is applied before the conventional preprocessing. Volvo noise, white noise and F16 noise are the noise types considered in evaluating the emotion-independent and speaker-independent emotional speech recognition system in noisy environments.
Keywords :
cepstral analysis; emotion recognition; feature extraction; hidden Markov models; interference suppression; speaker recognition; white noise; EMO-DB; F16 noise; HTK; MFCC; Mel frequency cepstral coefficients; emotion independent speaker recognition systems; emotion independent speech recognition systems; emotional speech recognition system; hidden Markov model tool kit; noise reduction; speaker independent emotion recognition systems; speaker independent speech recognition systems; volvo noise; Cepstrum; Indexes; Mel frequency cepstral coefficient; Noise measurement; Speech; Speech recognition; Mel frequency cepstral coefficients (MFCC); Noise; Short-time energy; Zero crossing rate; recursive least square (RLS) filter
Conference_Title :
Signal Processing, Communication and Networking (ICSCN), 2015 3rd International Conference on
Conference_Location :
Chennai
Print_ISBN :
978-1-4673-6822-3
DOI :
10.1109/ICSCN.2015.7219844