DocumentCode
3165482
Title
Improved concept-to-speech generation in a dialogue system on road guidance
Author
Yagi, Yuji ; Hirose, Keikichi ; Takada, Seiya ; Minematsu, Nobuaki
Author_Institution
Graduate Sch. of Eng., Tokyo Univ.
fYear
2005
fDate
23-25 Nov. 2005
Lastpage
436
Abstract
In most spoken dialogue systems, text-to-speech conversion devices are used for reply speech generation. However, the use of such devices makes it difficult to reflect in the reply speech the higher-level linguistic (and para-/non-linguistic) information obtainable during the sentence generation process. This degrades reply speech quality, mainly in its prosodic features. A method is needed that converts the content of a reply directly into speech. This method, known as concept-to-speech conversion, was realized for reply speech generation in our spoken dialogue system on road guidance. It is an improved version of the one we formerly developed for an agent dialogue system. Reply sentences were generated by pasting words and/or phrases at the tag positions of a sentence frame, which was prepared in a tag-LISP form. To realize concept-to-speech conversion, the syntactic structure of phrases in the user's inputs is kept and utilized for sentence generation. Several improvements, such as prosodic phrase boundary positioning using the probability of word sequences, are also added to the prosodic control in speech synthesis. In the spoken dialogue system, a user was guided to a place marked on a map through conversation. Several dialogue management schemes were implemented to solve problems caused by the imperfect information on the roads given to the user and the system. A trial use of the system showed that smooth conversation between the user and the system was possible. The result clearly indicated better prosodic control for the newly developed method than for the original method.
Keywords
natural languages; speech synthesis; traffic engineering computing; agent dialogue system; concept-to-speech generation; prosodic features; prosodic phrase boundary positioning; reply sentence generation; reply speech generation; road guidance; spoken dialogue systems; tag-LISP form; Communication system control; Computer displays; Control systems; Degradation; Humans; Information science; Roads; Speech processing; Speech synthesis; System testing;
fLanguage
English
Publisher
IEEE
Conference_Titel
Cyberworlds, 2005. International Conference on
Conference_Location
Singapore
Print_ISBN
0-7695-2378-1
Type
conf
DOI
10.1109/CW.2005.53
Filename
1587574