Title :
Pseudo-Conventional N-Gram Representation of the Discriminative N-Gram Model for LVCSR
Author :
Zhou, Zhengyu ; Meng, Helen
Author_Institution :
Human-Comput. Commun. Lab., Chinese Univ. of Hong Kong, Shatin, China
Abstract :
The discriminative n-gram modeling approach re-ranks the N-best hypotheses generated during decoding and can effectively improve the performance of large-vocabulary continuous speech recognition (LVCSR). This work recasts the discriminative n-gram model as a pseudo-conventional n-gram model. The recasting allows the power of discriminative n-gram modeling to be conveniently incorporated into a single-pass decoding procedure. We also propose an efficient method for applying the pseudo model to rescore the recognition lattices generated during decoding. Experimental results show that when the test data is similar in nature to the training data, rescoring the recognition lattices with the pseudo model achieves better performance and efficiency than discriminative N-best re-ranking (i.e., re-ranking the N-best hypotheses with the discriminative n-gram model). We demonstrate that in this case, applying the pseudo model in decoding can be even more advantageous. However, when the test data is different in nature from the training data, discriminative N-best re-ranking may offer greater benefits than pseudo-model based lattice rescoring or decoding. Based on the pseudo-conventional n-gram representation, we also investigate the feasibility of combining discriminative n-gram modeling with other recognition post-processes and demonstrate that cumulative performance improvements can be achieved.
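To make the N-best re-ranking step described in the abstract concrete, the following is a minimal sketch only. It assumes a common linear form for a discriminative n-gram model (a weighted baseline recognizer score plus a weighted sum of n-gram counts); the abstract does not state the authors' exact formulation. The names `ngrams`, `rerank`, and `baseline_weight`, along with the toy hypotheses and weights, are hypothetical and chosen for illustration.

```python
from collections import Counter


def ngrams(words, n_max=2):
    """Return every n-gram of order 1..n_max in `words` as a tuple."""
    return [
        tuple(words[i:i + n])
        for n in range(1, n_max + 1)
        for i in range(len(words) - n + 1)
    ]


def rerank(nbest, weights, baseline_weight=1.0, n_max=2):
    """Sort N-best hypotheses by an assumed linear discriminative score.

    nbest:   list of (hypothesis_text, baseline_log_score) pairs
    weights: dict mapping n-gram tuples to learned feature weights
             (illustrative values here, not trained ones)
    """
    def score(item):
        text, base = item
        counts = Counter(ngrams(text.split(), n_max))
        # Assumed form: weighted baseline score plus weighted n-gram counts.
        bonus = sum(weights.get(g, 0.0) * c for g, c in counts.items())
        return baseline_weight * base + bonus

    return sorted(nbest, key=score, reverse=True)


# Toy 2-best list: the baseline recognizer slightly prefers the wrong string.
nbest = [("recognize speech", -10.0), ("wreck a nice beach", -9.5)]
weights = {("recognize", "speech"): 1.0, ("wreck",): -0.5}

best_hypothesis = rerank(nbest, weights)[0][0]
```

With these toy weights the re-ranked top hypothesis differs from the baseline's top hypothesis. Because a score of this assumed form is a sum over n-grams, it is natural to ask whether it can be folded into an ordinary n-gram model and applied to lattices or during decoding, which is the setting the abstract addresses.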
Keywords :
decoding; signal representation; speech coding; speech recognition; LVCSR; N-best hypotheses; discriminative N-best re-ranking; discriminative n-gram modeling approach; large-vocabulary continuous speech recognition; pseudo-conventional N-gram representation; pseudo-model based lattice rescoring; recognition lattices; single-pass decoding; Error analysis; Laboratories; Lattices; Maximum likelihood decoding; Maximum likelihood estimation; Parameter estimation; State estimation; Testing; Training data; Discriminative n-gram modeling
Journal_Title :
Selected Topics in Signal Processing, IEEE Journal of
DOI :
10.1109/JSTSP.2010.2047675