Title :
Semi-tied covariance matrices for hidden Markov models
Author_Institution :
Dept. of Eng., Cambridge Univ., UK
fDate :
5/1/1999
Abstract :
There is normally a simple choice made in the form of the covariance matrix to be used with continuous-density HMMs. Either a diagonal covariance matrix is used, with the underlying assumption that elements of the feature vector are independent, or a full or block-diagonal matrix is used, where all or some of the correlations are explicitly modeled. Unfortunately, when using full or block-diagonal covariance matrices, there tends to be a dramatic increase in the number of parameters per Gaussian component, limiting the number of components that can be robustly estimated. This paper introduces a new form of covariance matrix which allows a few “full” covariance matrices to be shared over many distributions, whilst each distribution maintains its own “diagonal” covariance matrix. In contrast to other schemes which have hypothesized a similar form, this technique fits within the standard maximum-likelihood criterion used for training HMMs. The new form of covariance matrix is evaluated on a large-vocabulary speech-recognition task. In initial experiments, the performance of the standard system was achieved using approximately half the number of parameters. Moreover, a 10% reduction in word error rate compared to a standard system can be achieved with less than a 1% increase in the number of parameters and little increase in recognition time.
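The structure described in the abstract can be illustrated with a minimal NumPy sketch. It assumes that each component covariance is written as H diag(v) H^T, with H a full matrix shared by many components and v the component's own diagonal variances. The symbols H, A and v, the function names, and the model sizes are illustrative assumptions, not taken from this record, and the sketch does not cover how H is estimated. It shows that the shared matrix lets the likelihood be evaluated with a diagonal Gaussian in a transformed space, and it counts parameters for the three covariance forms (mixture weights are ignored).

```python
import numpy as np


def semi_tied_covariance(H, diag_var):
    """Build one component's covariance from a shared full matrix H
    and that component's own diagonal variances (assumed form)."""
    return H @ np.diag(diag_var) @ H.T


def log_gauss_full(x, mean, cov):
    """Log density of a Gaussian with an explicit full covariance."""
    d = x.shape[0]
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)


def log_gauss_semi_tied(x, mean, A, diag_var):
    """Same log density, computed with A = inv(H) applied once to the
    centred vector, followed by a diagonal Gaussian evaluation."""
    d = x.shape[0]
    z = A @ (x - mean)
    _, logdet_A = np.linalg.slogdet(A)
    quad = np.sum(z * z / diag_var)
    return logdet_A - 0.5 * (
        d * np.log(2.0 * np.pi) + np.sum(np.log(diag_var)) + quad
    )


def parameter_counts(dim, n_components, n_shared):
    """Mean and covariance parameters for each covariance form."""
    diagonal = n_components * 2 * dim
    full = n_components * (dim + dim * (dim + 1) // 2)
    semi_tied = diagonal + n_shared * dim * dim
    return {"diagonal": diagonal, "full": full, "semi_tied": semi_tied}


# Hypothetical sizes, chosen only for illustration.
rng = np.random.default_rng(0)
dim = 5
H = rng.normal(size=(dim, dim)) + 3.0 * np.eye(dim)
var = rng.uniform(0.5, 2.0, size=dim)
mean = rng.normal(size=dim)
x = rng.normal(size=dim)

full_ll = log_gauss_full(x, mean, semi_tied_covariance(H, var))
tied_ll = log_gauss_semi_tied(x, mean, np.linalg.inv(H), var)
print("log-likelihoods agree:", np.isclose(full_ll, tied_ll))
print(parameter_counts(dim=39, n_components=10000, n_shared=1))
```

With the hypothetical sizes in the last line (39-dimensional features, 10,000 components, one shared matrix), the semi-tied form adds 1,521 parameters to the 780,000 of the diagonal system, an increase of about 0.2%, whereas the full-covariance form needs 8,190,000.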
Keywords :
Gaussian processes; correlation methods; covariance matrices; hidden Markov models; maximum likelihood estimation; speech recognition; Gaussian component; block-diagonal covariance matrix; continuous-density HMM; correlations; distributions; experiments; feature vector; full diagonal covariance matrix; large-vocabulary speech-recognition task; performance; recognition time; semi-tied covariance matrices; standard maximum-likelihood criterion; training; word error rate reduction; Covariance matrix; Decorrelation; Discrete cosine transforms; Discrete transforms; Linear discriminant analysis; Robustness; Vectors
Journal_Title :
Speech and Audio Processing, IEEE Transactions on