How to train a discriminative front end with stochastic gradient descent and maximum mutual information

Author

Droppo, Jasha ; Mahajan, Milind ; Gunawardana, Asela ; Acero, Alex

Author_Institution

Speech Technol. Group, Microsoft Res., Redmond, WA

fYear

2005

fDate

27-27 Nov. 2005

Firstpage

41

Lastpage

46

Abstract

This paper presents a general discriminative training method for the front end of an automatic speech recognition system. The SPLICE parameters of the front end are trained using stochastic gradient descent (SGD) of a maximum mutual information (MMI) objective function. SPLICE is chosen for its ability to approximate both linear and non-linear transformations of the feature space. SGD is chosen for its simplicity of implementation. Results are presented on both the Aurora 2 small vocabulary task and the WSJ Nov-92 medium vocabulary task. It is shown that the discriminative front end is able to consistently increase system accuracy across different front end configurations and tasks.

Keywords

acoustic signal processing; gradient methods; speech recognition; stochastic processes; SPLICE parameters; automatic speech recognition system; discriminative front end; maximum mutual information; stereo piecewise linear compensation for environment; stochastic gradient descent; Automatic speech recognition; Cepstral analysis; Cepstrum; Feature extraction; Filtering; Linear approximation; Mutual information; Speech recognition; Stochastic processes; Vocabulary

fLanguage

English

Publisher

IEEE

Conference_Titel

Automatic Speech Recognition and Understanding, 2005 IEEE Workshop on

Conference_Location

San Juan

Print_ISBN

0-7803-9478-X

Electronic_ISBN

0-7803-9479-8

Type

conf

DOI

10.1109/ASRU.2005.1566501

Filename

1566501