• DocumentCode
    2531164
  • Title

    A hybrid duration model using CART and HMM

  • Author

    Gopinath, Deepa P. ; Vinod, C. ; Veena, S.G. ; Achuthsankar, S.N.

  • Author_Institution
    Coll. of Eng., Univ. of Kerala, Thiruvananthapuram
  • fYear
    2008
  • fDate
    19-21 Nov. 2008
  • Firstpage
    1
  • Lastpage
    4
  • Abstract
    Phoneme durations vary dynamically during continuous speech, giving speech its rhythm or prosody. To make synthesized speech sound natural, this durational variation is recreated using duration models. This paper proposes a hybrid duration model combining CART and HMM, based on a duration analysis of phonemes in the Malayalam language. The first part of the work analyzes the probability distribution of phoneme durations. Different probability distributions are fitted to the duration values of each phoneme, and the best fit is determined using quantile-quantile plots. The probability distributions of duration values under the different factors affecting duration are also analyzed. The durations of phonemes in Malayalam news, along with the corresponding feature vectors constructed on the basis of this analysis, form the training data for a Classification and Regression Tree (CART). The HMM is trained to capture the deviation between the duration values predicted by CART and the actual durations. The HMM proposed in this paper uses a novel multilevel architecture with 10 levels, where each level corresponds to a phoneme of the input word. Each level has 50 states corresponding to the deviation of the duration values of the phonemes. The combined model predicts duration as the sum of the duration predicted by CART and the deviation value predicted by the HMM. Objective evaluation of the model gave an RMSE of 8.32 ms.
  • Keywords
    hidden Markov models; mean square error methods; natural language processing; regression analysis; signal classification; speech processing; speech synthesis; statistical distributions; trees (mathematics); CART; HMM; Malayalam language; Malayalam news phoneme duration model; RMSE; classification-and-regression tree; continuous speech synthesis; feature vector; hidden Markov model; probability distribution; quantile-quantile plot; Classification tree analysis; Databases; Hidden Markov models; Predictive models; Probability distribution; Regression tree analysis; Rhythm; Speech analysis; Speech synthesis; Support vector machines
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    TENCON 2008 - 2008 IEEE Region 10 Conference
  • Conference_Location
    Hyderabad
  • Print_ISBN
    978-1-4244-2408-5
  • Electronic_ISBN
    978-1-4244-2409-2
  • Type

    conf

  • DOI
    10.1109/TENCON.2008.4766762
  • Filename
    4766762
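
The combination step described in the abstract (final duration = CART prediction + HMM-predicted deviation) can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not say how deviations map to the 50 states, so the deviation range (-50 ms to +50 ms), the uniform 2 ms bins, and all function names below are assumptions made for illustration. The CART and HMM themselves are not modelled; their outputs are passed in as plain lists.

```python
import math

N_STATES = 50        # from the abstract: 50 deviation states per level
DEV_MIN_MS = -50.0   # assumption: smallest deviation covered by the states
DEV_MAX_MS = 50.0    # assumption: largest deviation covered by the states
BIN_MS = (DEV_MAX_MS - DEV_MIN_MS) / N_STATES  # assumption: uniform bins (2 ms)


def deviation_to_state(dev_ms):
    """Quantize a deviation (actual - CART prediction) to a state index 0..49."""
    clipped = min(max(dev_ms, DEV_MIN_MS), DEV_MAX_MS - 1e-9)
    return int((clipped - DEV_MIN_MS) // BIN_MS)


def state_to_deviation(state):
    """Map a state index back to the centre of its deviation bin, in ms."""
    return DEV_MIN_MS + (state + 0.5) * BIN_MS


def combine(cart_ms, hmm_states):
    """Hybrid prediction: CART duration plus the deviation decoded from the HMM state."""
    return [c + state_to_deviation(s) for c, s in zip(cart_ms, hmm_states)]


def rmse(predicted, actual):
    """Root-mean-square error between predicted and actual durations."""
    total = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(total / len(predicted))


if __name__ == "__main__":
    cart = [80.0, 60.0]   # hypothetical CART outputs for a two-phoneme word
    states = [25, 0]      # hypothetical decoded state per level (phoneme)
    hybrid = combine(cart, states)
    print(hybrid)
    print(rmse(hybrid, [78.0, 15.0]))
```

With at most 10 levels, a word of up to 10 phonemes would yield one decoded state per phoneme, and `combine` applies the corresponding correction to each CART output in turn.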