• DocumentCode
    3165482
  • Title

    Improved concept-to-speech generation in a dialogue system on road guidance

  • Author

    Yagi, Yuji ; Hirose, Keikichi ; Takada, Seiya ; Minematsu, Nobuaki

  • Author_Institution
    Graduate Sch. of Eng., Tokyo Univ.
  • fYear
    2005
  • fDate
    23-25 Nov. 2005
  • Lastpage
    436
  • Abstract
    In most spoken dialogue systems, text-to-speech conversion devices are used for reply speech generation. However, the use of such devices makes it difficult to reflect in the reply speech the higher-level linguistic (and para-/non-linguistic) information obtainable during the sentence generation process. This degrades reply speech quality, mainly in its prosodic features. A method is needed that directly converts the content of a reply into speech. This method, known as concept-to-speech conversion, was realized for reply speech generation in our spoken dialogue system on road guidance. It is an improved version of the one we formerly developed for an agent dialogue system. Reply sentences were generated by pasting words and/or phrases at the tag positions of a sentence frame, which was prepared in a tag-LISP form. To realize concept-to-speech conversion, the syntactic structure of phrases in the user's inputs is kept and utilized for sentence generation. Several improvements, such as prosodic phrase boundary positioning using the probability of word sequences, were also added to the prosodic control in speech synthesis. In the spoken dialogue system, a user was guided to a place marked on a map through conversation. Several dialogue management schemes were implemented to solve the problems caused by the imperfect information on the roads given to the user and the system. A trial use of the system showed that smooth conversation between the user and the system was possible. The results clearly indicated better prosodic control for the newly developed method than for the original method.
  • Keywords
    natural languages; speech synthesis; traffic engineering computing; agent dialogue system; concept-to-speech generation; prosodic features; prosodic phrase boundary positioning; reply sentence generation; reply speech generation; road guidance; spoken dialogue systems; tag-LISP form; Communication system control; Computer displays; Control systems; Degradation; Humans; Information science; Roads; Speech processing; Speech synthesis; System testing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Cyberworlds, 2005. International Conference on
  • Conference_Location
    Singapore
  • Print_ISBN
    0-7695-2378-1
  • Type

    conf

  • DOI
    10.1109/CW.2005.53
  • Filename
    1587574