Title :
Speaker recognition model using two-dimensional mel-cepstrum and predictive neural network
Author :
Kitamura, Tadashi ; Takei, Shinsai
Author_Institution :
Dept. of Intelligence & Comput. Sci., Nagoya Inst. of Technol., Japan
Abstract :
This paper describes a speaker recognition model using a two-dimensional mel-cepstrum (TDMC) and a predictive neural network. The speaker model consists of two networks. The first is a self-organizing vector quantization (VQ) map (Kohonen feature map). The second is a predictive network, which learns transitional patterns on the feature map of each speaker's model. The TDMC consists of the averaged features and the dynamic features of the 2D mel-log spectra in the analyzed interval. The measure for speaker recognition is obtained by combining the VQ distortion on the feature map with the prediction error of the predictive network. In this study, text-independent speaker identification experiments for eight speakers were carried out. The experimental results show that the combination of a feature map and a predictive network is very effective, and that the proposed model using a TDMC shows robustness with respect to the time interval.
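The combined measure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the two terms are combined, so the weighted sum and its `weight` parameter are assumptions, as are the squared Euclidean distance, the plain codebook standing in for the trained Kohonen feature map, and the `predict` callable standing in for the predictive neural network. All function names are hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Sequence[float]
# Hypothetical speaker model: (codebook, predictor of the next map node).
Model = Tuple[List[Vector], Callable[[Vector], Vector]]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance (assumed distortion measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(frame: Vector, codebook: List[Vector]) -> Tuple[int, float]:
    """Return the index of the nearest codebook node and its distortion."""
    dists = [sq_dist(frame, node) for node in codebook]
    best = min(range(len(codebook)), key=dists.__getitem__)
    return best, dists[best]


def speaker_score(frames: List[Vector], codebook: List[Vector],
                  predict: Callable[[Vector], Vector],
                  weight: float = 0.5) -> float:
    """Weighted sum of mean VQ distortion and mean prediction error.

    The weighted-sum form is an assumption made for illustration.
    """
    quantized = [quantize(f, codebook) for f in frames]
    vq_distortion = sum(d for _, d in quantized) / len(frames)

    # Sequence of map nodes visited by the utterance.
    nodes = [codebook[i] for i, _ in quantized]
    # Error between the predicted next node and the node actually visited.
    errors = [sq_dist(predict(prev), nxt) for prev, nxt in zip(nodes, nodes[1:])]
    pred_error = sum(errors) / len(errors) if errors else 0.0

    return weight * vq_distortion + (1.0 - weight) * pred_error


def identify(frames: List[Vector], models: Dict[str, Model],
             weight: float = 0.5) -> str:
    """Pick the speaker whose model gives the lowest combined score."""
    return min(models,
               key=lambda name: speaker_score(frames, *models[name], weight=weight))
```

Under these assumptions, an utterance is assigned to the speaker whose model yields the lowest score. Setting `weight` to 1.0 reduces the measure to VQ distortion alone, which gives a simple baseline for checking what the prediction term contributes.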
Keywords :
cepstral analysis; prediction theory; self-organising feature maps; speaker recognition; vector quantisation; 2D mel-cepstrum; 2D mel-log spectra; Kohonen feature map; averaged features; distortion; dynamic features; pattern learning; prediction error; predictive neural network; robustness; self-organizing vector quantization map; speaker recognition model; text-independent speaker identification; time interval; Distortion measurement; Hidden Markov models; Leg; Measurement units; Neural networks; Predictive models; Robustness; Speaker recognition; Speech recognition; Training data;
Conference_Title :
Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on
Conference_Location :
Philadelphia, PA
Print_ISBN :
0-7803-3555-4
DOI :
10.1109/ICSLP.1996.607972