DocumentCode :
312033
Title :
An evaluation of statistical language modeling for speech recognition using a mixed category of both words and parts-of-speech
Author :
Wakita, Yumi ; Kawai, Jun ; Iida, Hitoshi
Author_Institution :
ATR Interpreting Telecommun. Res. Lab., Kyoto, Japan
Volume :
1
fYear :
1996
fDate :
3-6 Oct 1996
Firstpage :
530
Abstract :
In our previous paper, we proposed a mixed category of words and parts-of-speech names (the MWP category) based on class N-gram modeling (Kawai et al., 1995). However, we had not confirmed the efficiency of the MWP category. In this paper, we evaluate the proposed MWP category. First, we used “coverage of words and category sequences to open data” and “perplexity to training data” for the evaluation, and confirmed that the characteristics of parts-of-speech are useful for generating a suitable class N-gram model. In speech recognition experiments, we also confirmed that class N-gram modeling using the MWP category is effective in improving the recognition rate for open data that shows a low coverage of words and category sequences, without decreasing the recognition rate much for closed data.
Keywords :
computational linguistics; natural languages; probability; speech recognition; statistical analysis; MWP category; category sequences; class N-gram modeling; closed data; parts-of-speech; statistical language modeling; words; Character generation; Computer aided analysis; Entropy; Frequency; Predictive models; Smoothing methods; Testing; Training data
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on
Conference_Location :
Philadelphia, PA
Print_ISBN :
0-7803-3555-4
Type :
conf
DOI :
10.1109/ICSLP.1996.607171
Filename :
607171