• DocumentCode
    2179399
  • Title
    Using F0 to constrain the unit selection Viterbi network
  • Author
    Conkie, Alistair; Syrdal, Ann K.
  • Author_Institution
    AT&T Labs. - Res., Florham Park, NJ, USA
  • fYear
    2011
  • fDate
    22-27 May 2011
  • Firstpage
    5376
  • Lastpage
    5379
  • Abstract
    The goal of the work described here is to limit the computation needed in unit selection Viterbi search for text-to-speech synthesis. The broader goal is to improve speech quality through the practical use of significantly larger databases. This paper focuses on reducing the number of concatenation cost calculations. By making certain weak assumptions about F0 distributions, we estimate that only a fraction of possible concatenations are relevant. A method is proposed for selecting the relevant concatenations by imposing an ordering constraint on candidate units. The ordering is based on unit F0 value(s). Strengths and weaknesses of this approach are discussed, and data is presented on calculation complexity compared with naive Viterbi search. A listening test was conducted to investigate the effect on synthesis quality under various configurations of algorithm and database.
  • Keywords
    maximum likelihood estimation; search problems; speech synthesis; Viterbi search; text-to-speech synthesis; unit selection Viterbi network; CMOS integrated circuits; Complexity theory; Databases; Silicon; Speech; Synthesizers; Viterbi algorithm; concatenation costs; unit selection
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on
  • Conference_Location
    Prague
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4577-0538-0
  • Electronic_ISBN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2011.5947573
  • Filename
    5947573
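
The abstract describes ordering candidate units by F0 so that only a fraction of concatenation costs need to be computed. The record does not give the algorithm itself, so the following is only a minimal sketch of that general idea under stated assumptions, not the authors' method: each candidate carries a single F0 value at each edge, candidates are sorted by F0, and a join is evaluated only when the F0 mismatch falls inside a fixed window. The names `Unit`, `join_cost`, `viterbi`, `window`, and `demo_lattice` are all hypothetical, and the concatenation cost is a placeholder.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass
from math import inf


@dataclass(frozen=True)
class Unit:
    name: str
    f0_start: float     # F0 in Hz at the unit's left edge (assumed feature)
    f0_end: float       # F0 in Hz at the unit's right edge (assumed feature)
    target_cost: float


def join_cost(prev, cur):
    # Placeholder concatenation cost: F0 mismatch at the join (assumption).
    return abs(prev.f0_end - cur.f0_start)


def viterbi(lattice, window=None):
    """Return (best_cost, best_path, n_join_evals).

    window=None runs the naive search over all predecessor pairs.
    Otherwise only predecessors whose f0_end lies within `window` Hz of
    the current unit's f0_start are evaluated.
    """
    n_evals = 0
    columns, backs = [], []
    # Each column is kept sorted by f0_end, the key used when it acts
    # as the predecessor column.
    prev_units = sorted(lattice[0], key=lambda u: u.f0_end)
    prev_cost = [u.target_cost for u in prev_units]
    columns.append(prev_units)

    for col in lattice[1:]:
        keys = [u.f0_end for u in prev_units]   # sorted, so bisect applies
        cur_units = sorted(col, key=lambda u: u.f0_end)
        cur_cost, cur_back = [], []
        for u in cur_units:
            if window is None:
                lo, hi = 0, len(prev_units)
            else:
                lo = bisect_left(keys, u.f0_start - window)
                hi = bisect_right(keys, u.f0_start + window)
            best, arg = inf, None
            for i in range(lo, hi):
                n_evals += 1                    # one concatenation cost
                c = prev_cost[i] + join_cost(prev_units[i], u)
                if c < best:
                    best, arg = c, i
            cur_cost.append(best + u.target_cost)  # stays inf if unreachable
            cur_back.append(arg)
        columns.append(cur_units)
        backs.append(cur_back)
        prev_units, prev_cost = cur_units, cur_cost

    j = min(range(len(prev_cost)), key=prev_cost.__getitem__)
    best_cost = prev_cost[j]
    if best_cost == inf:
        return inf, [], n_evals                 # window pruned every path
    path = [columns[-1][j]]
    for t in range(len(backs) - 1, -1, -1):     # follow back-pointers
        j = backs[t][j]
        path.append(columns[t][j])
    path.reverse()
    return best_cost, path, n_evals


# Toy lattice with made-up values, for illustration only.
demo_lattice = [
    [Unit("a0", 100, 100, 1.0), Unit("a1", 200, 200, 1.0)],
    [Unit("b0", 100, 110, 1.0), Unit("b1", 205, 210, 1.0),
     Unit("b2", 150, 150, 1.0)],
    [Unit("c0", 112, 120, 1.0), Unit("c1", 300, 300, 1.0)],
]

if __name__ == "__main__":
    print(viterbi(demo_lattice))               # naive search
    print(viterbi(demo_lattice, window=10.0))  # F0-windowed search
```

Calling `viterbi(demo_lattice)` and `viterbi(demo_lattice, window=10.0)` and comparing the third return value shows how many concatenation costs each variant evaluates. A window that is too narrow can prune the best path or every path, which is the kind of trade-off the abstract says was examined through complexity data and a listening test.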