Title :
Speaker recognition through nonstationary vector AR model
Author :
Xingxing Lu ; Wanchun Fei
Author_Institution :
Coll. of Textile & Clothing Eng., Soochow Univ., Suzhou, China
Abstract :
Time-varying characteristic frequencies are extracted from the mean Mel cepstrum, which is obtained by Fourier analysis of speech signals. Two of these frequencies are selected, and the Mel cepstrum values at each of them form a time series. Using time series pretreatment and mathematical statistics, each series is separated into a deterministic component and a stochastic component. The two stochastic components together form a bivariate (binary) time series. To extract further parameters of the speaker's speech signal, a time-varying parameter vector AR (TVPVAR) model is established and analyzed. Speakers are then identified in two ways: from the stochastic components and from the residuals of the TVPVAR model. Speech from 10 speakers, 100 speeches per speaker, was sampled, and each speech was selected in turn to be recognized. The experiments show that the recognition rate based on the residuals of the TVPVAR model (99.6%) is higher than the rate based on the stochastic components (98.7%). This indicates that the TVPVAR model is effective for analyzing autocovariance-nonstationary vector time series.
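The residual step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the form of the TVPVAR model, so everything here is an assumption made for illustration: a first-order model whose coefficient matrix varies linearly with normalized time, x_t = (A0 + s_t A1) x_{t-1} + e_t, fitted by ordinary least squares. The function name `fit_tvp_var1`, the linear time dependence, the model order, and the simulated data are hypothetical and are not taken from the paper.

```python
import numpy as np


def fit_tvp_var1(x):
    """Least-squares fit of x_t = (A0 + s_t * A1) x_{t-1} + e_t.

    Assumed model form (not from the paper): s_t in (0, 1] is normalized
    time, so the coefficient matrix drifts linearly over the series.
    x has shape (T, d). Returns (A0, A1, residuals).
    """
    x = np.asarray(x, dtype=float)
    T, d = x.shape
    s = np.arange(1, T) / (T - 1)        # normalized time for t = 1..T-1
    prev = x[:-1]                        # lagged values x_{t-1}
    # Regressor matrix [x_{t-1}, s_t * x_{t-1}], shape (T-1, 2d)
    Z = np.hstack([prev, s[:, None] * prev])
    coef, *_ = np.linalg.lstsq(Z, x[1:], rcond=None)   # shape (2d, d)
    A0 = coef[:d].T                      # constant part of the coefficients
    A1 = coef[d:].T                      # time-varying part
    resid = x[1:] - Z @ coef             # one-step prediction residuals
    return A0, A1, resid


# Simulated bivariate series standing in for the two stochastic components.
rng = np.random.default_rng(0)
T = 2000
A0_true = np.array([[0.5, 0.1], [0.0, 0.3]])
A1_true = np.array([[-0.3, 0.0], [0.1, 0.2]])
x = np.zeros((T, 2))
for t in range(1, T):
    s_t = t / (T - 1)
    x[t] = (A0_true + s_t * A1_true) @ x[t - 1] + rng.standard_normal(2)

A0, A1, resid = fit_tvp_var1(x)
print("estimated A0:\n", A0)
print("estimated A1:\n", A1)
print("residual covariance:\n", np.cov(resid.T))
```

In a recognition setting, the residual series (or statistics of it, such as its covariance) would be the feature compared across speakers; how the paper performs that comparison is not stated in the abstract.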
Keywords :
Fourier transforms; cepstral analysis; feature extraction; speaker recognition; stochastic processes; time series; Fourier transform; autocovariance nonstationary vector time series; binary time series pretreatment; characteristic frequency Mel cepstrum value series; deterministic component; mathematical statistics; mean Mel cepstrum; nonstationary vector AR model; speech signal analysis; stochastic component; time-varying characteristic frequency extraction; time-varying parameter vector AR model; Cepstrum; Speech; Speech recognition; Time frequency analysis; Time series analysis; TVPVAR model; Mahalanobis distance; nonstationary time series
Conference_Title :
Fuzzy Systems and Knowledge Discovery (FSKD), 2011 Eighth International Conference on
Print_ISBN :
978-1-61284-180-9
DOI :
10.1109/FSKD.2011.6019557