Title :
Binary quantization of feature vectors for robust text-independent speaker identification
Author :
Yuan, Zhong-Xuan ; Xu, Bo-Ling ; Yu, Chong-Zhi
Author_Institution :
Inst. of Acoust., Nanjing Univ., China
fDate :
1/1/1999
Abstract :
We present a novel approach to vector quantization in which a feature vector is represented by a binary vector; we call it binary quantization (BQ). The effectiveness of BQ is evaluated with the standard performance criterion of vector quantization, the distortion (distance) measure. At 12 bits per analysis frame, the average distortion caused by BQ is even lower than the intraspeaker average distance between two repetitions of the same word (after DTW alignment). Since the output of BQ is a binary sequence, it can be combined with a forward Hamming net classifier. Based on the idea of a hierarchical model for describing a speaker's individual characteristics, a text-independent speaker identification system was set up. Experimental results show that the system performs well: it requires little memory and computation and, more importantly, it is strongly robust to additive white Gaussian noise.
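The general idea of pairing binary codes with a Hamming-distance classifier can be sketched as follows. This is only an illustrative sketch, not the authors' method: the abstract does not say how BQ derives its bits, so the per-component thresholding in `binarize`, the `templates` dictionary of stored speaker codes, and the summed-distance decision rule in `identify` are all assumptions made for the example. Picking the stored pattern with the smallest Hamming distance is the decision a Hamming net computes.

```python
from typing import Dict, List, Sequence


def binarize(frame: Sequence[float], thresholds: Sequence[float]) -> int:
    """Pack one feature frame into an integer, one bit per component.

    ASSUMPTION: a bit is 1 when the component exceeds its threshold.
    The abstract does not specify the actual BQ rule.
    """
    code = 0
    for value, threshold in zip(frame, thresholds):
        code = (code << 1) | (1 if value > threshold else 0)
    return code


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two codes differ."""
    return bin(a ^ b).count("1")


def identify(
    frames: Sequence[Sequence[float]],
    thresholds: Sequence[float],
    templates: Dict[str, List[int]],
) -> str:
    """Return the speaker whose stored codes are closest to the input.

    For each frame, take the minimum Hamming distance to each speaker's
    stored codes, then sum over frames (hypothetical decision rule).
    """
    scores = {name: 0 for name in templates}
    for frame in frames:
        code = binarize(frame, thresholds)
        for name, codes in templates.items():
            scores[name] += min(hamming(code, t) for t in codes)
    return min(scores, key=scores.get)


# Toy usage with 12-bit codes (12 bits per frame, as in the abstract).
thresholds = [0.0] * 12
templates = {"A": [0b111111000000], "B": [0b000000111111]}
frames = [[1.0] * 6 + [-1.0] * 6, [1.0] * 5 + [-1.0] * 7]
best = identify(frames, thresholds, templates)
```

Storing each frame as a 12-bit integer and comparing with XOR plus a bit count is one way to see why such a scheme can need little memory and computation.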
Keywords :
AWGN; binary sequences; feature extraction; pattern classification; speaker recognition; speech coding; vector quantisation; DTW alignment; additive Gaussian white noise; average distortion; binary quantization; binary sequence; binary vector; distance measure; distortion measure; experimental results; feature vectors; forward Hamming net classifier; hierarchical model; intraspeaker average distance; memory space; performance; robust text-independent speaker identification; speaker individual characteristics; speech coding; vector quantization; Acoustics; Associate members; Binary sequences; Clustering algorithms; Distortion measurement; Hidden Markov models; Robustness; Speaker recognition; Speech recognition; Vector quantization;
Journal_Title :
Speech and Audio Processing, IEEE Transactions on