Title :
Round-Robin Duel Discriminative Language Models
Author :
Oba, Takanobu ; Hori, Takaaki ; Nakamura, Atsushi ; Ito, Akinori
Author_Institution :
NTT Commun. Sci. Labs., NTT Corp., Kyoto, Japan
Date :
5/1/2012
Abstract :
Discriminative training has attracted considerable attention in both the machine learning and speech recognition communities. The discriminative approach constructs a model that distinguishes correct samples from incorrect ones, whereas the conventional generative approach estimates the distribution of correct samples. We propose a novel discriminative training method and apply it to a language model for reranking speech recognition hypotheses. The proposed method uses round-robin duel discrimination (R2D2) criteria, in which all pairs of sentence hypotheses, including pairs of incorrect sentences, are distinguished from each other while taking their error rates into account. Because the objective function is convex, the global optimum can be found with a standard parameter estimation method such as the quasi-Newton method. Furthermore, the proposed method is an extension of the global conditional log-linear model, whose objective function corresponds to that of conditional random fields. Our experimental results show that R2D2 outperforms conventional methods in many situations, including different languages, different feature constructions, and different levels of difficulty.
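The pairwise idea in the abstract can be illustrated with a small sketch. This is not the paper's objective function: it is an assumed, simplified loss that takes a log-sum-exp over all ordered pairs of hypotheses for one utterance, with each pair's term shifted by the difference in error counts. Because it is a log-sum-exp of functions that are linear in the weights, it is convex. The names `pairwise_loss_and_grad`, `train`, `rerank`, the scaling factor `sigma`, and the toy data are all hypothetical, and plain gradient descent stands in for the quasi-Newton method named in the abstract.

```python
import math

def pairwise_loss_and_grad(weights, hyps, sigma=1.0):
    """Assumed pairwise loss for one utterance.

    hyps is a list of (feature_vector, error_count) tuples.
    The loss is log sum_{i != j} exp((s_i - s_j) + sigma * (e_i - e_j)),
    where s_i is the linear score of hypothesis i. It is large when a
    high-error hypothesis i outscores a low-error hypothesis j.
    """
    scores = [sum(w * x for w, x in zip(weights, f)) for f, _ in hyps]
    terms = []  # (exponent, feature difference) per ordered pair
    for i, (fi, ei) in enumerate(hyps):
        for j, (fj, ej) in enumerate(hyps):
            if i == j:
                continue
            exponent = (scores[i] - scores[j]) + sigma * (ei - ej)
            terms.append((exponent, [a - b for a, b in zip(fi, fj)]))
    top = max(e for e, _ in terms)  # subtract the max for numerical stability
    z = sum(math.exp(e - top) for e, _ in terms)
    loss = top + math.log(z)
    grad = [0.0] * len(weights)
    for e, diff in terms:
        p = math.exp(e - top) / z  # softmax weight of this pair
        for k, d in enumerate(diff):
            grad[k] += p * d
    return loss, grad

def train(hyps, dim, steps=300, lr=0.1):
    """Plain gradient descent (a stand-in for a quasi-Newton optimizer)."""
    weights = [0.0] * dim
    for _ in range(steps):
        _, grad = pairwise_loss_and_grad(weights, hyps)
        weights = [w - lr * g for w, g in zip(weights, grad)]
    return weights

def rerank(weights, hyps):
    """Return the index of the highest-scoring hypothesis."""
    scores = [sum(w * x for w, x in zip(weights, f)) for f, _ in hyps]
    return max(range(len(hyps)), key=lambda i: scores[i])

# Toy utterance: three hypotheses with 2-dimensional features and error counts.
toy_hyps = [([1.0, 0.0], 0), ([0.5, 0.5], 1), ([0.0, 1.0], 3)]
initial_loss, _ = pairwise_loss_and_grad([0.0, 0.0], toy_hyps)
weights = train(toy_hyps, dim=2)
final_loss, _ = pairwise_loss_and_grad(weights, toy_hyps)
best = rerank(weights, toy_hyps)
```

In this sketch every pair contributes to the loss, including pairs in which neither hypothesis is the correct sentence, which is the round-robin aspect the abstract describes. A real system would sum the loss over many utterances and would likely add regularization.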
Keywords :
learning (artificial intelligence); natural language processing; parameter estimation; speech recognition; conditional random fields; discriminative training; error rate; feature constructions; global conditional log-linear model; machine learning; normal parameter estimation method; objective function; quasi-Newton method; round-robin duel discriminative language models; sentence hypotheses; speech recognition communities; Data models; Error analysis; Speech; Speech processing; Speech recognition; Training; Vectors; Discriminative language model; error correction; round-robin duel discrimination (R2D2)
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
DOI :
10.1109/TASL.2011.2174225