Title :
An automatic method for learning a Japanese lexicon for recognition of spontaneous speech
Author :
Tomokiyo, Laura Mayfield ; Ries, Klaus
Author_Institution :
Interactive Syst. Labs., Carnegie Mellon Univ., Pittsburgh, PA, USA
Abstract :
When developing a speech recognition system, one must start by deciding what the units to be recognized should be. This is for the most part a straightforward choice in the case of word-based languages such as English, but becomes an issue even in handling languages with a complex compounding system like German; with an agglutinative language like Japanese, which provides no spaces in written text, the choice is not at all obvious. Once an appropriate unit has been determined, the problem of consistently segmenting transcriptions of training data must be addressed. This paper describes a method for learning a lexicon from a training corpus which contains no word-level segmentation, applied to the problem of building a Japanese speech recognition system. We show not only that one can satisfactorily segment transcribed training data automatically, avoiding human error, but also that our system, when trained with the automatically segmented corpus, showed a significant improvement in recognition performance.
Keywords :
natural languages; speech recognition; statistical analysis; English; German; Japanese lexicon learning; Japanese speech recognition system; agglutinative language; automatic method; complex compounding system; recognition performance; spontaneous speech recognition; statistical method; training corpus segmentation; transcribed training data; word-based languages; word-level segmentation; Automatic speech recognition; Character recognition; Dictionaries; Humans; Indium tin oxide; Mutual information; Natural languages; Performance evaluation; Speech recognition; Training data
Conference_Title :
Acoustics, Speech and Signal Processing, 1998. Proceedings of the 1998 IEEE International Conference on
Conference_Location :
Seattle, WA
Print_ISBN :
0-7803-4428-6
DOI :
10.1109/ICASSP.1998.674428