A comparative study of discriminative training using non-uniform criteria for cross-layer acoustic modeling

Author

Weng, Chao ; Juang, Biing-Hwang

Author_Institution

Center for Signal and Image Processing, Georgia Institute of Technology, Atlanta, GA, USA

fYear

2012

fDate

25-30 March 2012

Firstpage

4089

Lastpage

4092

Abstract

This work presents a comparative study of discriminative training (DT) using non-uniform criteria for cross-layer acoustic modeling. Two DT frameworks, minimum classification error-like (MCE-like) and minimum phone error-like (MPE-like), are each augmented to allow error costs to be embedded at the phoneme (model) level. To facilitate the comparison, we implement both augmented DT frameworks under the same umbrella, using error costs derived from the same cross-layer confusion matrix. Experiments on the large-vocabulary WSJ0 task demonstrate the effectiveness of both DT frameworks with the formulated non-uniform error cost embedded. Several preliminary investigations into the effect of the dynamic range of the error cost are also presented.

Keywords

matrix algebra; speech recognition; comparative study; cross-layer acoustic modeling; cross-layer confusion matrix; discriminative training frameworks; large vocabulary task WSJ0; minimum classification error like DT frameworks; minimum phone error like DT frameworks; nonuniform criteria; nonuniform error cost embedded; Accuracy; Acoustics; Dynamic range; Hidden Markov models; Linear programming; Training; discriminative training; non-uniform error cost

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on

Conference_Location

Kyoto

ISSN

1520-6149

Print_ISBN

978-1-4673-0045-2

Electronic_ISBN

1520-6149

Type

conf

DOI

10.1109/ICASSP.2012.6288817

Filename

6288817