DocumentCode :
1931962
Title :
Speaker age interval and sex identification based on Jitters, Shimmers and Mean MFCC using supervised and unsupervised discriminative classification methods
Author :
Sadeghi Naini, A. ; Homayounpour, M.M.
Author_Institution :
Dept. of Comput. Eng. and Inf. Technol., Amirkabir Univ. of Technol., Tehran, Iran
Volume :
1
fYear :
2006
fDate :
16-20 Nov. 2006
Abstract :
The discrimination ability of long-term speech features, namely jitter, shimmer and mean MFCC, is investigated for speaker age interval and sex identification. First, as a preliminary study of this discrimination ability, two well-known unsupervised classification methods, k-means and FCM, were used. Then two supervised discriminative classification approaches, an MLP neural network and an SVM, were employed for more precise age interval and sex identification. In addition, to study the mutual influence of the age interval and sex discriminative features, a cascade combination of two MLP neural networks, one trained for age interval identification and the other for sex identification, was used separately. Most practical applications of age interval and sex identification are remote applications in which the speech signal is usually affected by telecommunication channels. To take this effect into account, a telephonic database was used in the experiments. The results show that jitter and shimmer discriminate well between male and female speakers and between young and old speakers, but do not adequately discriminate small age intervals. Mean MFCC, on the other hand, is not suitable for unsupervised sex classification but improves supervised sex classification performance. These coefficients also carry useful information about speaker age interval and can reduce the identification error rate.
Keywords :
multilayer perceptrons; speaker recognition; speech processing; support vector machines; unsupervised learning; MLP neural network; SVM; jitters; mean MFCC; sex identification; shimmers; speaker age interval; supervised discriminative classification; telephonic database; unsupervised discriminative classification methods; Communication channels; Databases; Error analysis; Jitter; Mel frequency cepstral coefficient; Neural networks; Signal processing; Speech; Support vector machine classification; Support vector machines;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Signal Processing, 2006 8th International Conference on
Conference_Location :
Beijing
Print_ISBN :
0-7803-9736-3
Electronic_ISBN :
0-7803-9736-3
Type :
conf
DOI :
10.1109/ICOSP.2006.345516
Filename :
4128931