• DocumentCode
    1843926
  • Title

    Pitch-scaled spectrum based excitation model for HMM-based Speech Synthesis

  • Author

    Zhengqi Wen ; Jianhua Tao ; Hain, H.-U.

  • Author_Institution
    Nat. Lab. of Pattern Recognition, Inst. of Autom., Beijing, China
  • Volume
    1
  • fYear
    2012
  • fDate
    21-25 Oct. 2012
  • Firstpage
    609
  • Lastpage
    612
  • Abstract
    The quality of speech generated by Hidden Markov Model (HMM)-based Speech Synthesis Systems (HTS) suffers from a 'buzzing' problem caused by an oversimplified vocoding technique. This paper proposes an excitation model to improve the parametric representation of speech in HTS. The residual obtained from inverse filtering retains detailed harmonic structure of speech that is not captured by the linear prediction (LP) spectrum. The pitch-scaled spectrum can therefore supplement the LP spectrum in speech reconstruction. This spectrum is compressed by principal component analysis (PCA), and the eigenvalues serve as the periodic parameter. An aperiodic measure is also extracted from the pitch-scaled spectrum, and a sigmoid function fitted to this measure serves as the aperiodic parameter. These two parameters are integrated into HTS training as excitation parameters. Listening tests showed that the proposed technique generates better-sounding speech than the pulse-train excitation model and achieves results comparable to STRAIGHT.
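The two compression steps named in the abstract (PCA on a set of spectra, and a sigmoid fitted to an aperiodicity curve) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the use of projection weights as the compressed representation, and the logit-space least-squares fit are all assumptions made for illustration, since the abstract does not specify these details.

```python
import numpy as np

def pca_compress(spectra, n_components):
    """Project each row of `spectra` (frames x bins) onto the top principal directions."""
    mean = spectra.mean(axis=0)
    centered = spectra - mean
    # SVD of the centered data: rows of vt are the principal directions.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:n_components]
    weights = centered @ basis.T  # low-dimensional representation per frame
    return weights, basis, mean

def pca_reconstruct(weights, basis, mean):
    """Map the low-dimensional weights back to approximate spectra."""
    return weights @ basis + mean

def fit_sigmoid(freq, measure, eps=1e-6):
    """Fit measure ~ 1 / (1 + exp(-a * (freq - b))) with a linear fit in logit space.

    Assumes `measure` lies in (0, 1); values are clipped away from 0 and 1.
    """
    y = np.clip(measure, eps, 1.0 - eps)
    logit = np.log(y / (1.0 - y))  # equals a * freq - a * b for an exact sigmoid
    slope, intercept = np.polyfit(freq, logit, 1)
    return slope, -intercept / slope  # (a, b)
```

With this representation, each frame would be described by a few PCA weights plus the two sigmoid parameters `(a, b)`, which is one plausible way to obtain a compact excitation parameter vector for statistical training.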
  • Keywords
    filtering theory; hidden Markov models; principal component analysis; speech synthesis; HMM based speech synthesis; LP spectrum; PCA; excitation model; hidden Markov model; inverse filtering; linear prediction spectrum; parametric representation; periodic parameter; pitch scaled spectrum; speech harmonic structure; speech reconstruction; speech synthesis system; vocoding technique; HMM-based Speech Synthesis; linear prediction; pitch-scaled spectrum
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Signal Processing (ICSP), 2012 IEEE 11th International Conference on
  • Conference_Location
    Beijing
  • ISSN
    2164-5221
  • Print_ISBN
    978-1-4673-2196-9
  • Type

    conf

  • DOI
    10.1109/ICoSP.2012.6491561
  • Filename
    6491561