• DocumentCode
    336785
  • Title

    Phrase splicing and variable substitution using the IBM trainable speech synthesis system

  • Author

    Donovan, R.E. ; Franz, M. ; Sorensen, J.S. ; Roukos, S.

  • Author_Institution
    IBM Thomas J. Watson Res. Center, Yorktown Heights, NY, USA
  • Volume
    1
  • fYear
    1999
  • fDate
    15-19 Mar 1999
  • Firstpage
    373
  • Abstract
    This paper describes a phrase splicing and variable substitution system which offers an intermediate form of automated speech production lying between the extremes of recorded utterance playback and full text-to-speech synthesis. The system incorporates a trainable speech synthesiser and an application-specific set of pre-recorded phrases. The text to be synthesised is converted to a phone sequence using phone sequences present in the pre-recorded phrases wherever possible, and a pronunciation dictionary elsewhere. The synthesis inventory of the synthesiser is augmented with the synthesis information associated with the pre-recorded phrases used to construct the phone sequence. The synthesiser then performs a dynamic programming search over the augmented inventory to select a segment sequence to produce the output speech. The system enables the seamless splicing of pre-recorded phrases both with other phrases and with synthetic speech. It enables very high quality speech to be produced automatically within a limited domain.
  • Keywords
    dynamic programming; search problems; sequences; speech synthesis; IBM trainable speech synthesis system; augmented inventory; automated speech production; dynamic programming search; high quality speech; output speech; phone sequences; phrase splicing; pre-recorded phrases; pronunciation dictionary; recorded utterance playback; segment sequence; synthesis information; synthesis inventory; text-to-speech synthesis; trainable speech synthesiser; variable substitution; Dictionaries; Dynamic programming; Intrusion detection; Mutual funds; Production systems; Speech synthesis; Splicing; Telephony
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 1999. Proceedings., 1999 IEEE International Conference on
  • Conference_Location
    Phoenix, AZ
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-5041-3
  • Type

    conf

  • DOI
    10.1109/ICASSP.1999.758140
  • Filename
    758140
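
The pipeline summarised in the abstract can be illustrated with a minimal sketch. It is not the IBM system's implementation: the phrase table, the dictionary entries, the cost values, and the names `text_to_phones`, `join_cost`, and `select_segments` are all hypothetical, chosen only to show the two steps the abstract names (phone sequences taken from pre-recorded phrases where possible with a dictionary fallback, then a dynamic programming search over an inventory that includes the phrase segments).

```python
from typing import Dict, List, Tuple

# Hypothetical toy data: pre-recorded phrase (as a word tuple) -> its phones.
PHRASES: Dict[Tuple[str, ...], List[str]] = {
    ("your", "balance", "is"): ["Y", "AO", "R", "B", "AE", "L", "AH", "N", "S", "IH", "Z"],
}
# Hypothetical pronunciation dictionary for words not covered by a phrase.
DICTIONARY: Dict[str, List[str]] = {
    "ten": ["T", "EH", "N"],
    "dollars": ["D", "AA", "L", "ER", "Z"],
}

def text_to_phones(words: List[str]) -> List[Tuple[str, str]]:
    """Return (phone, source) pairs, preferring the longest matching phrase."""
    out: List[Tuple[str, str]] = []
    max_len = max((len(p) for p in PHRASES), default=0)
    i = 0
    while i < len(words):
        for n in range(min(max_len, len(words) - i), 0, -1):
            key = tuple(words[i:i + n])
            if key in PHRASES:
                out.extend((ph, "phrase") for ph in PHRASES[key])
                i += n
                break
        else:
            # No phrase starts here: fall back to the dictionary.
            out.extend((ph, "dict") for ph in DICTIONARY[words[i]])
            i += 1
    return out

# A segment is (recording name, index within that recording).
Segment = Tuple[str, int]

def join_cost(a: Segment, b: Segment) -> float:
    """Assumed cost: free if b directly follows a in the same recording."""
    return 0.0 if a[0] == b[0] and b[1] == a[1] + 1 else 1.0

def select_segments(candidates: List[List[Tuple[Segment, float]]]) -> List[Segment]:
    """Dynamic programming search: one candidate per position, minimum total cost.

    candidates[t] lists (segment, target_cost) options for position t.
    """
    # best holds, per candidate of the current position, (cost, path so far).
    best = [(tc, [seg]) for seg, tc in candidates[0]]
    for column in candidates[1:]:
        new_best = []
        for seg, tc in column:
            cost, path = min(
                ((c + join_cost(p[-1], seg), p) for c, p in best),
                key=lambda t: t[0],
            )
            new_best.append((cost + tc, path + [seg]))
        best = new_best
    return min(best, key=lambda t: t[0])[1]

phones = text_to_phones("your balance is ten dollars".split())

# Augmented inventory for three positions: base segments plus phrase segments.
toy_candidates = [
    [(("base", 7), 0.0), (("phrase", 0), 0.0)],
    [(("base", 3), 0.0), (("phrase", 1), 0.0)],
    [(("phrase", 2), 0.0), (("base", 9), 0.0)],
]
chosen = select_segments(toy_candidates)
```

With the assumed join cost, contiguous segments from one recording are free to concatenate, so the search keeps a pre-recorded phrase intact when its segments are available and switches to the base inventory only where it must.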