DocumentCode
3558768
Title
Optimizing the Performance of Spoken Language Recognition With Discriminative Training
Author
Zhu, Donglai ; Li, Haizhou ; Ma, Bin ; Lee, Chin-Hui
Author_Institution
Human Language Technol. Dept., Inst. for Infocomm Res., Singapore
Volume
16
Issue
8
fYear
2008
Firstpage
1642
Lastpage
1653
Abstract
The performance of a spoken language recognition system is typically formulated to reflect the detection cost and the strategic decision points along the detection-error-tradeoff curve. We propose a performance metrics optimization (PMO) approach to optimizing the detection performance of Gaussian mixture model classifiers. We design the objective functions to relate the model parameters directly to the performance metrics of interest, i.e., the detection cost function and the area under the detection-error-tradeoff curve. Both metrics are approximated by differentiable functions of the model parameters, so the parameters can be optimized with the generalized probabilistic descent algorithm, a typical discriminative training technique. We conduct experiments on the NIST 2003 and 2005 Language Recognition Evaluation corpora. The experimental results show that the PMO approach effectively improves performance over the maximum-likelihood training approach.
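The idea of approximating a detection cost by a differentiable function can be illustrated with a minimal sketch. It is not the formulation used in the paper: the sigmoid smoothing, the slope `alpha`, the threshold, the cost weights, and the function names below are all assumptions chosen for illustration. The hard miss and false-alarm counts are replaced by sigmoids of the score margin, so the cost varies smoothly with the scores and could in principle be minimized by gradient descent.

```python
import math


def sigmoid(x, alpha=1.0):
    """Numerically stable logistic function of alpha * x."""
    z = alpha * x
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def hard_detection_cost(target_scores, nontarget_scores, threshold=0.0,
                        c_miss=1.0, c_fa=1.0, p_target=0.5):
    """Detection cost computed from hard (non-differentiable) error counts."""
    p_miss = sum(s < threshold for s in target_scores) / len(target_scores)
    p_fa = sum(s >= threshold for s in nontarget_scores) / len(nontarget_scores)
    return c_miss * p_target * p_miss + c_fa * (1.0 - p_target) * p_fa


def smoothed_detection_cost(target_scores, nontarget_scores, threshold=0.0,
                            alpha=10.0, c_miss=1.0, c_fa=1.0, p_target=0.5):
    """Same cost with each 0/1 error indicator replaced by a sigmoid.

    alpha (an assumed smoothing parameter) controls how closely the
    sigmoid approaches the step function.
    """
    p_miss = sum(sigmoid(threshold - s, alpha)
                 for s in target_scores) / len(target_scores)
    p_fa = sum(sigmoid(s - threshold, alpha)
               for s in nontarget_scores) / len(nontarget_scores)
    return c_miss * p_target * p_miss + c_fa * (1.0 - p_target) * p_fa


if __name__ == "__main__":
    # Made-up scores for illustration only; not data from the paper.
    targets = [2.0, 1.0, -0.5]
    nontargets = [-2.0, -1.0, 0.5]
    print(hard_detection_cost(targets, nontargets))
    print(smoothed_detection_cost(targets, nontargets, alpha=50.0))
```

With a large `alpha` the smoothed cost approaches the hard cost; a smaller `alpha` gives a smoother surface at the price of a looser approximation.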
Keywords
Gaussian processes; function approximation; learning (artificial intelligence); natural language processing; optimisation; pattern classification; probability; speech recognition; Gaussian mixture model classifiers; detection-error-tradeoff curve; differentiable function approximation; discriminative training; generalized probabilistic descent algorithm; performance metrics optimization; spoken language recognition; Acoustic testing; Cost function; Detectors; Distribution functions; Error analysis; Humans; Maximum likelihood detection; Measurement; NIST; Natural languages; Classifier optimization; detection error tradeoff (DET); discriminative training; spoken language recognition (SLR);
fLanguage
English
Journal_Title
Audio, Speech, and Language Processing, IEEE Transactions on
Publisher
IEEE
ISSN
1558-7916
Type
jour
DOI
10.1109/TASL.2008.2005319
Filename
4648923