Title :
A hybrid duration model using CART and HMM
Author :
Gopinath, Deepa P. ; Vinod, C. ; Veena, S.G. ; Achuthsankar, S.N.
Author_Institution :
Coll. of Eng., Univ. of Kerala, Thiruvananthapuram
Abstract :
The durations of phonemes vary dynamically during continuous speech, giving speech its rhythm or prosody. To make synthesized speech sound natural, this durational variation is recreated using duration models. This paper proposes a hybrid duration model combining CART and HMM, based on duration analysis of phonemes in the Malayalam language. The first part of the work analyzes the probability distribution of phoneme durations. Different probability distributions are fitted to the duration values of each phoneme, and the best fit is determined using quantile-quantile plots. The probability distributions of duration values under the different factors affecting duration are also analyzed. The durations of phonemes in Malayalam news, along with the corresponding feature vectors framed on the basis of this analysis, form the training data for a Classification and Regression Tree (CART). The HMM is trained to capture the deviation between the duration values predicted by CART and the actual durations. The HMM proposed in this paper uses a novel multilevel architecture with 10 levels, where each level corresponds to a phoneme of the input word. Each level has 50 states corresponding to deviations of the phoneme duration values. The combined model predicts duration as the sum of the duration predicted by CART and the deviation predicted by the HMM. Objective evaluation of the model gave an RMSE of 8.32 ms.
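The final combination step described in the abstract (CART duration plus HMM deviation, scored by RMSE) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the equal-width quantization of deviations into 50 bins, the -50 ms to +50 ms range, the function names, and the toy numbers. The CART predictions and the decoded HMM state indices are simply given as inputs, not computed.

```python
import math

def deviation_bins(min_dev_ms, max_dev_ms, n_states=50):
    """Centers of n_states equal-width deviation bins (assumed quantization)."""
    width = (max_dev_ms - min_dev_ms) / n_states
    return [min_dev_ms + (i + 0.5) * width for i in range(n_states)]

def combine(cart_ms, state_indices, centers):
    """Final duration = CART prediction + deviation of the decoded state."""
    return [c + centers[s] for c, s in zip(cart_ms, state_indices)]

def rmse(predicted_ms, actual_ms):
    """Root mean square error between predicted and actual durations."""
    n = len(actual_ms)
    squared = sum((p - a) ** 2 for p, a in zip(predicted_ms, actual_ms))
    return math.sqrt(squared / n)

# Toy values, made up for illustration only (not data from the paper).
centers = deviation_bins(-50.0, 50.0)   # 50 bins, each 2 ms wide
cart_ms = [80.0, 65.0, 120.0]           # hypothetical CART predictions
states = [25, 20, 30]                   # hypothetical decoded HMM states
actual_ms = [82.0, 55.0, 130.0]         # hypothetical measured durations

predicted = combine(cart_ms, states, centers)
print(predicted, rmse(predicted, actual_ms))
```

In this sketch each phoneme of a word would contribute one decoded state, so a 10-level model as described in the abstract would supply up to 10 state indices per word.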
Keywords :
hidden Markov models; mean square error methods; natural language processing; regression analysis; signal classification; speech processing; speech synthesis; statistical distributions; trees (mathematics); CART; HMM; Malayalam language; Malayalam news phoneme duration model; RMSE; classification-and-regression tree; continuous speech synthesis; feature vector; hidden Markov model; probability distribution; quantile-quantile plot; Classification tree analysis; Databases; Hidden Markov models; Predictive models; Probability distribution; Regression tree analysis; Rhythm; Speech analysis; Speech synthesis; Support vector machines;
Conference_Title :
TENCON 2008 - 2008 IEEE Region 10 Conference
Conference_Location :
Hyderabad
Print_ISBN :
978-1-4244-2408-5
Electronic_ISBN :
978-1-4244-2409-2
DOI :
10.1109/TENCON.2008.4766762