Recent development of discriminative training using non-uniform criteria for cross-level acoustic modeling

Author

Weng, Chao ; Juang, Biing-Hwang

Author_Institution

Center for Signal & Image Process., Georgia Inst. of Technol., Atlanta, GA, USA

fYear

2011

fDate

22-27 May 2011

Firstpage

5332

Lastpage

5335

Abstract

In this paper, we extend our previous study on discriminative training using non-uniform criteria for speech recognition. The work emphasizes how acoustic modeling interacts with the risk at a higher level, which is more relevant to the most commonly used evaluation measures, e.g., word error rate (WER). Specifically, the non-uniform error cost is first derived at the word level to minimize the risk w.r.t. WER and then computed on the word lattice using the forward-backward algorithm. With the statistics obtained from the forward-backward algorithm, the competing hypotheses for each label word are found by performing dynamic programming between the label word sequence and the word lattice at the phone level. To alleviate the level inconsistency between the acoustic model (phone level) and the evaluation measure (word level), the derived error cost is embedded into the overall objective function in a cross-level fashion. Experiments on the large-vocabulary WSJ0 task demonstrate the effectiveness of the overall approach, which outperforms two prevalent discriminative training methods and achieves about 13% relative improvement over the baseline system.

Keywords

dynamic programming; speech recognition; WER; cross-level acoustic modeling; discriminative training; forward-backward algorithm; nonuniform criteria; prevalent discriminative training methods; word error rate; Acoustics; Hidden Markov models; Lattices; Measurement uncertainty; Training; non-uniform error cost; word lattice

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on

Conference_Location

Prague

ISSN

1520-6149

Print_ISBN

978-1-4577-0538-0

Electronic_ISBN

1520-6149

Type

conf

DOI

10.1109/ICASSP.2011.5947562

Filename

5947562