Title :
Modified estimation of between-class covariance matrix in linear discriminant analysis of speech
Author :
Viszlay, Peter ; Juhar, Jozef ; Pleva, Matus
Author_Institution :
Dept. of Electron. & Multimedia Commun., Tech. Univ. of Kosice, Kosice, Slovakia
Abstract :
Linear discriminant analysis (LDA) is a popular supervised feature transformation applied in current automatic speech recognition (ASR). Generally, the parameters of LDA are computed from training data partitioned into classes. If the number of classes is smaller than the dimension of the supervectors (as is typical in phoneme-based LDA), the between-class covariance matrix can become singular or close to singular (the singularity problem in classical LDA). In this paper, we present a modification of the standard between-class covariance matrix estimation, which represents one possible approach to solving the singularity problem. Our method works directly with the supervectors instead of the class mean vectors. The number of estimation cycles is much larger because more data are used during the computation, so the matrix structure can be significantly refined. This implies that longer contexts can be used while the singularity problem is efficiently eliminated. The effectiveness of the proposed estimation is evaluated on Slovak phoneme-based and triphone-based large vocabulary continuous speech recognition (LVCSR) tasks. The method is compared to state-of-the-art MFCCs and to LDA trained in the standard way. The experimental results confirm that the modified LDA considerably outperforms the MFCCs and consistently improves on conventional LDA.
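Illustration (not part of the original record): the singularity problem named in the abstract can be reproduced on toy data. The sketch below assumes the textbook between-class scatter S_B = sum_k N_k (mu_k - mu)(mu_k - mu)^T built from class mean vectors; it does not implement the paper's modified estimator, whose exact formula is not given here. The helper names (`mean`, `between_class_scatter`, `rank`) and the data are hypothetical. With K = 3 classes in 5 dimensions, the rank of S_B cannot exceed K - 1, so the 5 x 5 matrix is singular.

```python
from fractions import Fraction


def mean(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def between_class_scatter(classes):
    """Textbook S_B = sum_k N_k (mu_k - mu)(mu_k - mu)^T (assumed form)."""
    all_vectors = [x for vecs in classes for x in vecs]
    mu = mean(all_vectors)  # global mean over all vectors
    d = len(mu)
    s_b = [[Fraction(0)] * d for _ in range(d)]
    for vecs in classes:
        mu_k = mean(vecs)  # class mean vector
        diff = [a - b for a, b in zip(mu_k, mu)]
        for i in range(d):
            for j in range(d):
                s_b[i][j] += len(vecs) * diff[i] * diff[j]
    return s_b


def rank(matrix):
    """Exact matrix rank via Gaussian elimination on Fractions."""
    m = [row[:] for row in matrix]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


# Hypothetical toy data: K = 3 classes, 5-dimensional vectors.
raw = [
    [[1, 0, 2, 1, 0], [2, 1, 0, 3, 1]],
    [[0, 3, 1, 1, 2], [1, 4, 2, 0, 5]],
    [[5, 1, 1, 2, 2], [4, 0, 3, 3, 1], [6, 2, 2, 1, 0]],
]
classes = [[[Fraction(v) for v in x] for x in vecs] for vecs in raw]

s_b = between_class_scatter(classes)
# The rank is bounded by K - 1, which is below the dimension 5.
print("dimension:", len(s_b), "rank of S_B:", rank(s_b))
```

Exact rational arithmetic is used so that the rank is not blurred by floating-point round-off; a practical implementation would more likely use a numerical library.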
Keywords :
covariance matrices; speech recognition; ASR; LDA; LVCSR; MFCC; Slovak phoneme-based large vocabulary continuous speech recognition task; automatic speech recognition; class mean vectors; linear discriminant speech analysis; modified between-class covariance matrix estimation; singularity problem; supervectors; supervised feature transformation; triphone-based large vocabulary continuous speech recognition task; Context; Covariance matrices; Estimation; Hidden Markov models; Training; Vectors
Conference_Title :
2013 20th International Conference on Systems, Signals and Image Processing (IWSSIP)
Conference_Location :
Bucharest
Print_ISBN :
978-1-4799-0941-4
DOI :
10.1109/IWSSIP.2013.6623480