Title :
On the effectiveness of MFCCs and their statistical distribution properties in speaker identification
Author :
Molla, Md Khademul Islam ; Hirose, Keikichi
Author_Institution :
Dept. of Frontier Informatics, Tokyo Univ., Japan
Abstract :
This paper presents a study on the effectiveness of mel-frequency cepstrum coefficients (MFCCs) and some of their statistical distribution properties (skewness, kurtosis, standard deviation) as features for text-dependent speaker identification. A multi-layer neural network trained with the backpropagation learning algorithm is used as the classification tool. The MFCCs representing the speaker characteristics of a speech segment are computed by nonlinear filterbank analysis followed by a discrete cosine transform. The speaker identification efficiency and the convergence speed of the neural network are investigated for different combinations of the proposed features. The results show that the first MFCC degrades identification performance and that the statistical distribution parameters speed up the training of the neural network.
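The feature pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the filterbank energies of each frame are already available and strictly positive, uses an unnormalised DCT-II, uses population moments with non-excess kurtosis, and treats index 0 as the "first MFCC". All function names and the `drop_first` parameter are hypothetical.

```python
import math


def dct_ii(values):
    """Unnormalised DCT-II of a sequence of numbers."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * k * (i + 0.5) / n) for i, v in enumerate(values))
        for k in range(n)
    ]


def mfcc_from_filterbank(energies):
    """Cepstral coefficients for one frame.

    Assumption: `energies` are strictly positive outputs of a nonlinear
    (mel-spaced) filterbank that has been computed elsewhere.
    """
    return dct_ii([math.log(e) for e in energies])


def distribution_stats(samples):
    """Return (standard deviation, skewness, kurtosis) from population moments.

    Kurtosis is the non-excess form (about 3 for Gaussian data).
    """
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((x - mean) ** 2 for x in samples) / n
    m3 = sum((x - mean) ** 3 for x in samples) / n
    m4 = sum((x - mean) ** 4 for x in samples) / n
    if m2 == 0:
        # Constant track: higher-order shape statistics are undefined.
        return 0.0, 0.0, 0.0
    std = math.sqrt(m2)
    return std, m3 / std ** 3, m4 / m2 ** 2


def speaker_features(frames, drop_first=True):
    """Per-coefficient statistics over all frames of a speech segment.

    Assumption: dropping index 0 corresponds to omitting the first MFCC.
    """
    mfccs = [mfcc_from_filterbank(frame) for frame in frames]
    start = 1 if drop_first else 0
    features = []
    for k in range(start, len(mfccs[0])):
        track = [coeffs[k] for coeffs in mfccs]
        features.extend(distribution_stats(track))
    return features
```

A vector built this way could be fed to a classifier such as a multi-layer neural network. How the paper actually combines the raw MFCCs with these statistics is not stated in the abstract.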
Keywords :
audio signal processing; backpropagation; cepstral analysis; discrete cosine transforms; neural nets; signal classification; speaker recognition; statistical distributions; MFCC; backpropagation learning algorithm; classification tool; discrete cosine transform; human speech production model; kurtosis; mel-frequency cepstrum coefficients; multilayer neural network; nonlinear filterbank analysis; skewness; speaker characteristics; speech segmentation; standard deviation; statistical distribution properties; text-dependent speaker identification; Backpropagation algorithms; Cepstral analysis; Cepstrum; Convergence; Discrete cosine transforms; Filter bank; Multi-layer neural network; Neural networks; Speech analysis; Statistical distributions;
Conference_Title :
Virtual Environments, Human-Computer Interfaces and Measurement Systems, 2004. (VECIMS). 2004 IEEE Symposium on
Print_ISBN :
0-7803-8339-7
DOI :
10.1109/VECIMS.2004.1397204