DocumentCode :
3246310
Title :
Approach toward speech-to-speech translation system by using a collection of sentences and utterances
Author :
Sumita, Eiichiro ; Nakaiwa, Hiromi ; Kikui, Genichiro ; Yamamoto, Seiichi
Author_Institution :
ATR Spoken Language Translation Res. Labs., Kyoto, Japan
fYear :
2003
fDate :
30 Nov.-3 Dec. 2003
Firstpage :
652
Lastpage :
657
Abstract :
Corpus-based technology is very promising for speech-to-speech translation. However, the vital resource, a large-scale corpus of bilingual dialogues covering many domains, is prohibitively expensive to build. We propose to replace this large-scale dialogue corpus with a combination of two different types of bilingual corpora: (1) a large-scale collection of basic sentences that covers many domains; and (2) a small-scale collection of spoken dialogues that reflects the characteristics of spoken utterances. With these two corpora, we have been building a translation module for a speech-to-speech translation system. Using the basic sentence corpus, we have achieved high-quality translations with several machine-learning approaches. Based on an analysis of the spoken dialogue corpus, we found that splitting utterances into parts and concatenating the translated parts is an effective way to translate the longer utterances that are inherent in spoken dialogue.
Keywords :
language translation; learning (artificial intelligence); speech recognition; speech synthesis; bilingual dialogue corpus; corpus-based technology; machine learning methods; sentence collection method; speech-to-speech translation system; spoken dialogue utterance splitting; utterance collection method; Cities and towns; Humans; Laboratories; Large-scale systems; Machine learning; Natural languages; Oral communication; Speech; System testing; Vocabulary;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Automatic Speech Recognition and Understanding, 2003. ASRU '03. 2003 IEEE Workshop on
Print_ISBN :
0-7803-7980-2
Type :
conf
DOI :
10.1109/ASRU.2003.1318517
Filename :
1318517