Title :
Auditory model representation for speaker recognition
Author :
Colombi, John ; Anderson, Timothy R. ; Rogers, Steven K. ; Ruck, Dennis W. ; Warhola, G.T.
Author_Institution :
AFIT/EN, Wright-Patterson AFB, OH, USA
Abstract :
An examination of the KING database is presented that compares proven spectral processing techniques with an auditory model representation for speaker recognition. The feature sets compared are LPC (linear predictive coding) cepstral coefficients and auditory nerve firing rates provided by the Payton model. Both feature sets were quantized by two clustering algorithms: a Linde-Buzo-Gray algorithm and a Kohonen self-organizing feature map. The resulting classification, based on vector quantization distortion, indicates that the auditory model provides accuracies comparable with LPC cepstral coefficients in nonstudio-quality environments and over multiple sessions. For a 10-speaker subset using only voiced frames of 15-s segments, both feature sets achieve an identification rate of over 80%. Cepstral coefficients perform better on verification tasks, as measured with receiver operating characteristic curves.
Keywords :
hearing; linear predictive coding; physiological models; self-organising feature maps; speech recognition; vector quantisation; KING database; Kohonen self-organizing feature map; Linde-Buzo-Gray algorithm; accuracies; auditory model representation; auditory nerve firing rates; cepstral coefficients; clustering algorithms; identification rate; speaker recognition; vector quantized distortion based classification; verification
Conference_Title :
Acoustics, Speech, and Signal Processing, 1993. ICASSP-93., 1993 IEEE International Conference on
Conference_Location :
Minneapolis, MN, USA
Print_ISBN :
0-7803-7402-9
DOI :
10.1109/ICASSP.1993.319407
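The abstract describes speaker identification by vector quantization distortion: one codebook is trained per speaker, and a test utterance is assigned to the speaker whose codebook quantizes its frames with the lowest average distortion. The sketch below illustrates that general idea only. It is not the authors' implementation: the features are synthetic Gaussian vectors standing in for LPC cepstral coefficients or auditory nerve firing rates, and the codebook size, iteration count, squared-Euclidean distortion, and all function names (`train_codebook`, `average_distortion`, `identify`) are assumptions, since the record does not give the paper's settings.

```python
import numpy as np


def train_codebook(frames, size=4, iters=10, eps=1e-3):
    """LBG-style training: start from the global mean, then split and refine.

    `size` is assumed to be a power of two, because each split doubles
    the number of codewords.
    """
    codebook = frames.mean(axis=0, keepdims=True)
    while codebook.shape[0] < size:
        # Split every codeword into a slightly perturbed pair.
        codebook = np.vstack([codebook + eps, codebook - eps])
        for _ in range(iters):
            # Assign each frame to its nearest codeword (squared Euclidean).
            d = ((frames[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
            nearest = d.argmin(axis=1)
            # Move each codeword to the centroid of its assigned frames.
            for k in range(codebook.shape[0]):
                members = frames[nearest == k]
                if len(members):
                    codebook[k] = members.mean(axis=0)
    return codebook


def average_distortion(frames, codebook):
    """Mean squared distance from each frame to its nearest codeword."""
    d = ((frames[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d.min(axis=1).mean()


def identify(frames, codebooks):
    """Return the speaker label whose codebook gives the lowest distortion."""
    return min(codebooks, key=lambda name: average_distortion(frames, codebooks[name]))


# Synthetic stand-ins for per-speaker feature frames (12-dimensional).
rng = np.random.default_rng(0)
train = {
    "speaker_a": rng.normal(0.0, 1.0, (200, 12)),
    "speaker_b": rng.normal(3.0, 1.0, (200, 12)),
}
codebooks = {name: train_codebook(x) for name, x in train.items()}

# Unseen frames drawn from the same distribution as speaker_b.
test_frames = rng.normal(3.0, 1.0, (50, 12))
print(identify(test_frames, codebooks))
```

For verification rather than identification, the same `average_distortion` score could be compared against a threshold for a claimed speaker, and sweeping that threshold would trace out a receiver operating characteristic curve.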