DocumentCode :
1794695
Title :
Towards improving the performance of text/language independent speaker recognition systems
Author :
George, Kuruvachan K. ; Arunraj, K. ; Sreekumar, K.T. ; Kumar, C. Senthil ; Ramachandran, K.I.
Author_Institution :
Machine Intell. Res. Lab., Amrita Vishwa Vidyapeetham, Coimbatore, India
fYear :
2014
fDate :
6-11 Jan. 2014
Firstpage :
1
Lastpage :
6
Abstract :
Speaker recognition has been an active area of research for the last few decades because of its applications in national security and forensics. In this work, we present the details of a speaker recognition system developed using a universal background model and support vector machines (UBM-SVM). We explored several techniques to improve the performance of the baseline system, which uses mel frequency cepstral coefficients (MFCC) as input features. We developed and tested the speaker recognition system for 200 speakers, using data collected over 13 different channels, such as handset regular phone, speaker phone, regular phone headphone, regular phone, etc. We experimented with RelAtive SpecTrA (RASTA) processing and feature warping on the input MFCC features, and with nuisance attribute projection (NAP) on the Gaussian mixture model supervectors derived in the system. These techniques improved system performance significantly by minimizing the effect of the different channels. The details of the system implementation and results are presented in this paper. The complete system is developed in MATLAB and C/C++.
Keywords :
Gaussian processes; speaker recognition; support vector machines; text analysis; Gaussian mixture model supervectors; MFCC; NAP; RASTA processing; UBM-SVM; baseline system; feature warping; forensic applications; handset regular phone; mel frequency cepstral coefficients; national security; nuisance attribute projection; regular phone; regular phone headphone; relative spectra; speaker phone; text-language independent speaker recognition systems; universal background model and support vector machines; Adaptation models; Feature extraction; Mel frequency cepstral coefficient; Speaker recognition; Speech; Support vector machines; Training data; Feature Warping; Gaussian mixture model; Maximum a Posteriori (MAP) Adaptation; RASTA filtering; Speaker Recognition System; VAD; nuisance attribute projection; support vector machine
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Power Signals Control and Computations (EPSCICON), 2014 International Conference on
Conference_Location :
Thrissur
Print_ISBN :
978-1-4799-3611-3
Type :
conf
DOI :
10.1109/EPSCICON.2014.6887506
Filename :
6887506