Derivation of eigentriphones by weighted principal component analysis

Author

Ko, Tom ; Mak, Brian

Author_Institution

Dept. of Comput. Sci. & Eng., Hong Kong Univ. of Sci. & Technol., Clear Water Bay, China

fYear

2012

fDate

25-30 March 2012

Firstpage

4097

Lastpage

4100

Abstract

Last year we proposed a new acoustic modeling method called eigentriphones, in which all triphones are distinct (with no tied states) so that they may be more discriminative. In our method, frequent triphones are used to derive an eigenbasis using PCA, and the infrequent triphones are then “adapted” as a linear combination of the eigenvectors, which are also called eigentriphones. Although the eigentriphones method compares favorably with traditional tied-state triphones, the PCA procedure has two limitations: (1) only the frequent triphones are employed, and (2) they are considered “equal” even though some are more robust than others. In this paper, weighted PCA is proposed to solve both problems so that all triphones, frequent and infrequent alike, may contribute to the derivation of the eigentriphones, each to a different extent depending on its sample count. Experimental evaluation on the WSJ 5K-vocabulary speech recognition task shows that weighted PCA produces better models than simple PCA, and that its performance is fairly independent of the number of eigentriphones once more than 20% of them are used. As a consequence, all triphones may be represented by fewer eigentriphones, resulting in a more compact model.
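
The idea of weighting each triphone by its sample count can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `weighted_pca`, the representation of each triphone as a row vector, and the choice of weights proportional to the raw sample counts are all assumptions made for illustration, and the abstract does not specify the actual weighting function.

```python
import numpy as np

def weighted_pca(X, counts, k):
    """Hypothetical sketch of weighted PCA.

    X      : (n, d) array, one row per triphone (assumed vector representation)
    counts : (n,) sample count of each triphone
    k      : number of leading eigenvectors to keep
    """
    X = np.asarray(X, dtype=float)
    # Assumption: weights are proportional to sample counts, normalized to sum to 1.
    w = np.asarray(counts, dtype=float)
    w = w / w.sum()

    # Weighted mean and weighted covariance of the rows.
    mean = w @ X
    Xc = X - mean
    cov = (Xc * w[:, None]).T @ Xc

    # eigh returns eigenvalues in ascending order for a symmetric matrix,
    # so sort descending and keep the top k.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:k]
    return mean, eigvecs[:, order], eigvals[order]

# Two well-observed rows dominate the basis; the rare third row barely affects it.
X = np.array([[0.0, 0.0], [2.0, 0.0], [1.0, 4.0]])
mean, basis, variances = weighted_pca(X, counts=[1000, 1000, 1], k=1)
```

With equal counts the sketch reduces to ordinary PCA on the population covariance, which corresponds to the "simple PCA" baseline in which every contributing triphone is treated equally.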

Keywords

eigenvalues and eigenfunctions; principal component analysis; speech recognition; vocabulary; PCA; WSJ 5K-vocabulary speech recognition task; eigenbasis; eigentriphone derivation method; eigenvectors; frequent triphones; infrequent triphones; linear combination; weighted principal component analysis; Acoustics; Adaptation models; Hidden Markov models; Principal component analysis; Robustness; Speech recognition; Training; Eigentriphones; context-dependent acoustic modeling; eigenvoice adaptation; weighted PCA;

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on

Conference_Location

Kyoto

ISSN

1520-6149

Print_ISBN

978-1-4673-0045-2

Electronic_ISBN

1520-6149

Type

conf

DOI

10.1109/ICASSP.2012.6288819

Filename

6288819