Multilevel sampling and aggregation for discriminative training

Author

Yunxin Zhao ; Tuo Zhao ; Xin Chen

Author_Institution

Dept. of Comput. Sci., Univ. of Missouri, Columbia, MO, USA

fYear

2014

fDate

12-14 Sept. 2014

Firstpage

93

Lastpage

97

Abstract

We propose using data sampling in the extended Baum-Welch (EBW) algorithm for maximum mutual information (MMI) based estimation of speech acoustic models. To improve model robustness, we randomize the configurations of the sampled training sets and aggregate the numerator and denominator sufficient statistics. We further combine data sampling based ensemble acoustic modeling with the data sampling based EBW, forming a two-level data sampling mechanism for acoustic model training. In experiments on a telehealth conversational speech recognition task, the two-level data sampling mechanism gave a statistically significant absolute word accuracy gain of 3.56% over the conventional MMI baseline, corresponding to a 19.44% relative word error rate reduction.
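
The idea of aggregating numerator and denominator statistics over sampled training sets can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the sampled sets are configured, so the fixed sampling fraction, the function names, and the input layout (per-frame tuples of an observation with precomputed numerator and denominator occupancies for a single one-dimensional Gaussian) are all assumptions made for illustration. The smoothing constant D = E * denominator occupancy is a commonly used EBW heuristic, simplified here by omitting the usual positive-variance check.

```python
import random

def accumulate_stats(utterances):
    """Sum occupancy and first-order statistics for one 1-D Gaussian.

    Each utterance is a list of (x, gamma_num, gamma_den) tuples; the
    occupancies are assumed to be precomputed (hypothetical inputs).
    """
    stats = {"num_occ": 0.0, "num_x": 0.0, "den_occ": 0.0, "den_x": 0.0}
    for frames in utterances:
        for x, g_num, g_den in frames:
            stats["num_occ"] += g_num
            stats["num_x"] += g_num * x
            stats["den_occ"] += g_den
            stats["den_x"] += g_den * x
    return stats

def sampled_aggregate(utterances, n_subsets, fraction, rng):
    """Draw random utterance subsets and sum their statistics.

    Assumption: each subset is a fixed fraction of the training set,
    sampled without replacement.
    """
    total = {"num_occ": 0.0, "num_x": 0.0, "den_occ": 0.0, "den_x": 0.0}
    k = max(1, round(fraction * len(utterances)))
    for _ in range(n_subsets):
        subset = rng.sample(utterances, k)
        subset_stats = accumulate_stats(subset)
        for key in total:
            total[key] += subset_stats[key]
    return total

def ebw_mean_update(stats, old_mean, E=2.0):
    """EBW mean update with D = E * denominator occupancy (a heuristic)."""
    D = E * stats["den_occ"]
    denom = stats["num_occ"] - stats["den_occ"] + D
    return (stats["num_x"] - stats["den_x"] + D * old_mean) / denom

if __name__ == "__main__":
    toy = [[(1.0, 1.0, 0.5), (3.0, 1.0, 0.5)], [(2.0, 1.0, 0.0)]]
    agg = sampled_aggregate(toy, n_subsets=4, fraction=0.5, rng=random.Random(0))
    print(ebw_mean_update(agg, old_mean=0.0))
```

Summing the statistics before the update, rather than averaging separately updated models, keeps a single EBW step per iteration. Whether the paper aggregates in exactly this way cannot be confirmed from the abstract alone.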

Keywords

acoustic signal processing; estimation theory; signal sampling; speech recognition; telemedicine; acoustic model training; data sampling based EBW; data sampling based ensemble acoustic modeling; denominator sufficient statistics; discriminative training; extended Baum-Welch algorithm; maximum mutual information based estimation; multilevel aggregation; multilevel sampling; numerator sufficient statistics; sampled training sets; speech acoustic models; telehealth conversational speech recognition task; two-level data sampling mechanism; Accuracy; Acoustics; Computational modeling; Data models; Hidden Markov models; Maximum likelihood estimation; Training; data sampling; ensemble acoustic model

fLanguage

English

Publisher

IEEE

Conference_Titel

2014 9th International Symposium on Chinese Spoken Language Processing (ISCSLP)

Conference_Location

Singapore

Type

conf

DOI

10.1109/ISCSLP.2014.6936677

Filename

6936677