• DocumentCode
    1693866
  • Title
    Multi-distribution deep belief network for speech synthesis
  • Author

    Shiyin Kang ; Xiaojun Qian ; Meng, Hsiang-Yun

  • Author_Institution
    Dept. of Syst. Eng. & Eng. Manage., Chinese Univ. of Hong Kong, Hong Kong, China
  • fYear
    2013
  • Firstpage
    8012
  • Lastpage
    8016
  • Abstract
    The deep belief network (DBN) has been shown to be a good generative model in tasks such as handwritten digit image generation. Previous work on DBNs in the speech community has mainly focused on using a generatively pre-trained DBN to initialize a discriminative model for better acoustic modeling in speech recognition (SR). To fully utilize the DBN's generative nature, we propose to model the speech parameters, including spectrum and F0, simultaneously and to generate these parameters from the DBN for speech synthesis. Compared with the predominant HMM-based approach, objective evaluation shows that the spectrum generated from the DBN has less distortion. Subjective results also confirm the advantage of the DBN-generated spectrum, and the overall quality is comparable to that of a context-independent HMM.
  • Keywords
    belief networks; handwriting recognition; hidden Markov models; speech recognition; speech synthesis; DBN; HMM-based approach; SR; acoustic modeling; handwritten digit image generation; multidistribution deep belief network; speech community; speech parameters; Acoustics; Speech; Training; Deep belief network
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on
  • Conference_Location
    Vancouver, BC
  • ISSN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2013.6639225
  • Filename
    6639225