DocumentCode
1649167
Title
Voice source waveforms for utterance level speaker identification using support vector machines
Author
Vandyke, David ; Wagner, Michael ; Goecke, Roland
Author_Institution
Univ. of Canberra, Canberra, ACT, Australia
fYear
2013
Firstpage
1
Lastpage
7
Abstract
The voice source waveform generated by the periodic motion of the vocal folds during voiced speech remains to be fully utilised in automatic speaker recognition systems. We perform closed-set speaker identification experiments on the YOHO speech corpus, continuing our investigation into the level of speaker-discriminatory information present in a data-driven parameterisation of the voice source waveform obtained by closed-phase inverse filtering. Discriminatory modelling using support vector machines resulted in utterance-level correct identification rates of 85.3% when using a multi-class model and 72.5% when using a binary, one-against-all regression model, each on cohorts of 20 speakers. These results compare well with other speaker identification experiments in the literature employing features derived from the voice source waveform, and are encouraging given the hypothesis that such features should be complementary to the common magnitude spectral parameters (mel-cepstra).
Keywords
cepstral analysis; regression analysis; speaker recognition; speech processing; support vector machines; waveform analysis; YOHO speech corpus; automatic speaker recognition systems; closed-phase inverse filtering; closed-set speaker identification experiments; common magnitude spectral parameters; data driven parameterisation; discriminatory modelling; mel-cepstra; multiclass model; one-against-all regression model; periodic vocal fold motion; speaker discriminatory information; support vector machines; utterance level correct identification rates; utterance level speaker identification; voice source waveforms; voiced speech; Principal component analysis; Probes; Speech; Speech recognition; Support vector machines; Testing; Training; Glottal Waveform; Speaker Identification; Voice Source;
fLanguage
English
Publisher
IEEE
Conference_Titel
Information Technology in Asia (CITA), 2013 8th International Conference on
Conference_Location
Kota Samarahan
Print_ISBN
978-1-4799-1091-5
Type
conf
DOI
10.1109/CITA.2013.6637568
Filename
6637568