DocumentCode :
311036
Title :
Japanese large-vocabulary continuous-speech recognition using a business-newspaper corpus
Author :
Matsuoka, T. ; Ohtsuki, Katsutoshi ; Mori, Takeshi ; Yoshida, Kotaro ; Furui, Sadaoki ; Shirai, Katsuhiko
Author_Institution :
NTT Human Interface Labs., Musashino, Japan
Volume :
3
fYear :
1997
fDate :
21-24 Apr 1997
Firstpage :
1803
Abstract :
A large-vocabulary continuous-speech recognition (LVCSR) system was developed and evaluated. To evaluate the system, a Japanese business-newspaper speech corpus was designed and recorded. The corpus was designed so that it can be used for Japanese LVCSR research in the same way that the Wall Street Journal (WSJ) corpus, for example, is used for English LVCSR research. Since Japanese sentences are written without spaces between words, morphological analysis was introduced to segment sentences into words so that word n-gram language models could be used. To enable the use of detailed word n-gram (n⩾3) language models, a two-pass decoding strategy was applied. Context-dependent (CD) phone models and word trigram language models reduced the word error rate from 80.2% to 10.1% (an error reduction of about 88%). This result shows that CD phoneme modeling and word trigram language models can be used effectively in Japanese LVCSR.
Keywords :
decoding; mathematical morphology; natural languages; speech processing; speech recognition; Japanese; acoustic modelling; business-newspaper speech corpus; context-dependent phone models; large-vocabulary continuous-speech recognition; morphological analysis; sentence segmentation; two-pass decoding strategy; word error rate reduction; word n-gram language models; word trigram language models; Context modeling; Error analysis; Frequency; Humans; Laboratories; Natural languages; Speech analysis; Speech recognition; Testing; Vocabulary;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech, and Signal Processing, 1997. ICASSP-97., 1997 IEEE International Conference on
Conference_Location :
Munich
ISSN :
1520-6149
Print_ISBN :
0-8186-7919-0
Type :
conf
DOI :
10.1109/ICASSP.1997.598886
Filename :
598886