Title :
Fast approximate i-vector estimation using PCA
Author :
Omar, Mohamed Kamal
Author_Institution :
IBM T. J. Watson Res. Center, Yorktown Heights, NY, USA
Abstract :
The i-vector representation has become increasingly popular in speaker and language recognition systems. The projection matrix of the i-vector model is usually estimated with the iterative expectation-maximization (EM) algorithm. This work presents a novel approach for estimating both the projection matrix of the i-vector model and the i-vector representation of each utterance. In this approach, we formulate the estimation of the projection matrix as a principal component analysis (PCA) problem. Using the relation between PCA and a linear Gaussian model trained with the EM algorithm, we show that an approximate solution to the i-vector estimation problem can be obtained as the solution of a PCA problem. We evaluate the performance of our approximate i-vector estimation on the language recognition task of the Robust Automatic Transcription of Speech (RATS) project. Compared to the standard EM-based approach to i-vector estimation, the proposed approach reduces the computational time required to estimate the i-vector projection matrix by 50% relative and the computational time required to estimate the i-vector representation by 42% relative. In addition, our experiments show improvements of up to 29% relative in language recognition performance, measured by equal error rate, over the standard EM-based i-vector estimation.
Keywords :
Gaussian processes; error statistics; expectation-maximisation algorithm; matrix algebra; principal component analysis; signal representation; speaker recognition; PCA problem; RATS project; equal error rate; fast approximate i-vector estimation; i-vector projection matrix estimation; i-vector representation; iterative EM algorithm; iterative expectation maximization algorithm; language recognition performance; language recognition system; linear Gaussian model; robust automatic transcription of speech project; speaker recognition system; speech utterance; Approximation methods; Covariance matrices; Estimation; Mathematical model; Rats; Standards; EM algorithm; PCA; i-vector estimation; language recognition
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
Conference_Location :
South Brisbane, QLD
DOI :
10.1109/ICASSP.2015.7178821
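
The abstract's central idea, treating projection matrix estimation as a PCA problem, can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract does not give the statistics or normalization it uses, so the sketch assumes each utterance is already summarized as a fixed-length vector (synthetic data stands in for centered supervectors here). The function names `pca_projection` and `extract_ivector` are hypothetical. The sketch only shows the generic relation that the top principal directions of the data can serve as a projection matrix, and that the low-dimensional representation is then obtained by a single matrix product instead of an iterative procedure.

```python
import numpy as np


def pca_projection(supervectors, rank):
    """Return the data mean and the top-`rank` principal directions as columns.

    Assumption: `supervectors` is an (n_utterances, dim) array of
    per-utterance vectors; how the paper builds them is not stated here.
    """
    mean = supervectors.mean(axis=0)
    centered = supervectors - mean
    # Thin SVD: rows of vt are principal directions, sorted by singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:rank].T


def extract_ivector(supervector, mean, projection):
    """Project one centered vector onto the principal subspace."""
    return projection.T @ (supervector - mean)


# Synthetic stand-in data: a low-rank signal plus a small amount of noise.
rng = np.random.default_rng(0)
n_utts, dim, rank = 200, 40, 5
true_basis = rng.standard_normal((dim, rank))
latent = rng.standard_normal((n_utts, rank))
noise = 0.01 * rng.standard_normal((n_utts, dim))
supervectors = latent @ true_basis.T + noise

mean, T = pca_projection(supervectors, rank)
ivectors = np.array([extract_ivector(s, mean, T) for s in supervectors])

# Relative error of reconstructing the data from the low-dimensional vectors.
reconstruction = ivectors @ T.T + mean
rel_err = np.linalg.norm(supervectors - reconstruction) / np.linalg.norm(supervectors)
print(f"projection matrix shape: {T.shape}, relative error: {rel_err:.4f}")
```

Because the columns of `T` are orthonormal, extraction needs no matrix inversion per utterance, which is one plausible source of the kind of speed-up the abstract reports; the actual savings in the paper depend on details not given in this record.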