Title :
A Dynamic In-Search Data Selection Method With Its Applications to Acoustic Modeling and Utterance Verification
Author :
Jiang, Hui ; Soong, Frank K. ; Lee, Chin-Hui
Author_Institution :
Dept. of Comput. Sci., Toronto Univ., Ont., Canada
Abstract :
In this paper, we propose a dynamic in-search data selection method that automatically diagnoses competing information in speech data. In our method, Viterbi beam search is used to decode all training data. During decoding, all partial paths within the beam are examined to identify the so-called competing-token and true-token sets for each individual hidden Markov model (HMM). The collected data tokens are then applied to two specific tasks: acoustic modeling and utterance verification. In acoustic modeling, the true-token sets are used to adapt HMMs with a sequential maximum a posteriori adaptation method, while a generalized probabilistic descent-based discriminative training method is proposed to improve HMMs based on the competing-token sets. In utterance verification, under the framework of likelihood ratio testing, the true-token sets are employed to train positive models for the null hypothesis and the competing-token sets are used to estimate negative models for the alternative hypothesis. All the proposed methods are evaluated on the Bell Laboratories communicator system. Experimental results show that the new acoustic modeling method consistently improves recognition performance over our best maximum likelihood estimation models, with roughly a 1% absolute reduction in word error rate. The results also show that the new verification models significantly improve utterance verification performance over conventional anti-models, with a relative reduction of almost 30% in equal error rate when identifying misrecognized words in the recognition results.
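The likelihood ratio test described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: single 1-D Gaussians stand in for the positive and negative HMMs, and the names `fit_gaussian`, `llr_score`, `verify`, the toy token values, and the zero threshold are all assumptions made for illustration only.

```python
import math

def gaussian_loglik(x, mean, var):
    """Log-density of a 1-D Gaussian (a stand-in for an HMM score; assumption)."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)

def fit_gaussian(tokens):
    """Maximum likelihood mean and variance of a list of scalar tokens."""
    n = len(tokens)
    mean = sum(tokens) / n
    var = sum((t - mean) ** 2 for t in tokens) / n
    return mean, max(var, 1e-6)  # floor the variance to avoid division by zero

def llr_score(frames, positive, negative):
    """Average per-frame log likelihood ratio of positive vs. negative model."""
    total = sum(
        gaussian_loglik(x, *positive) - gaussian_loglik(x, *negative)
        for x in frames
    )
    return total / len(frames)

def verify(frames, positive, negative, threshold=0.0):
    """Accept the hypothesized word if the LLR reaches the threshold."""
    return llr_score(frames, positive, negative) >= threshold

# Hypothetical toy data: tokens collected for one model during decoding.
true_tokens = [0.9, 1.1, 1.0, 1.2, 0.8]        # trains the positive model
competing_tokens = [2.9, 3.1, 3.0, 3.2, 2.8]   # trains the negative model

positive = fit_gaussian(true_tokens)
negative = fit_gaussian(competing_tokens)

accepted = verify([1.0, 1.05, 0.95], positive, negative)
rejected = verify([3.0, 2.9], positive, negative)
```

In this sketch, frames that resemble the true tokens produce a positive log likelihood ratio and are accepted, while frames that resemble the competing tokens are rejected. In practice the threshold would be tuned on held-out data, for example at the operating point where false acceptance and false rejection rates are equal.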
Keywords :
hidden Markov models; maximum likelihood estimation; speech recognition; Bell Laboratories communicator system; Viterbi beam search; acoustic modeling; competing-token sets; dynamic in-search data selection method; hidden Markov model; hypothesis; maximum likelihood estimation models; probabilistic descent-based discriminative training method; sequential maximum a posteriori adaptation method; true-token sets; utterance verification; Acoustic applications; Acoustic beams; Acoustic testing; Error analysis; Hidden Markov models; Maximum likelihood decoding; Maximum likelihood estimation; Speech; Training data; Viterbi algorithm; Competing token; discriminative training; in-search data selection; log likelihood ratio (LLR) testing; sequential maximum a posteriori (MAP) adaptation; true token (TT)
Journal_Title :
Speech and Audio Processing, IEEE Transactions on
DOI :
10.1109/TSA.2005.851947