DocumentCode :
180512
Title :
Improvement of Probabilistic Acoustic Tube model for speech decomposition
Author :
Yang Zhang ; Zhijian Ou ; Mark Hasegawa-Johnson
Author_Institution :
Dept. of Electr. & Comput. Eng., Univ. of Illinois, Urbana, IL, USA
fYear :
2014
fDate :
4-9 May 2014
Firstpage :
7929
Lastpage :
7933
Abstract :
Current model-based speech analysis tends to be incomplete: only some of the parameters of interest (e.g. only the pitch or only the vocal tract) are modeled, while the rest, which may be equally important, are disregarded. The drawback is that, without joint modeling of correlated parameters, the analysis of speech parameters may be inaccurate or even incorrect. Motivated by this, we previously proposed a model called PAT (Probabilistic Acoustic Tube), in which pitch, vocal tract and energy are jointly modeled. This paper proposes an improved version of the PAT model, named PAT2, in which both the signal modeling and the probabilistic modeling are substantially revised. Compared to related work, PAT2 is much more comprehensive, incorporating mixed excitation, glottal wave and phase modeling. Experimental results show its ability to decompose speech into the desired parameters and its potential for speech synthesis.
Keywords :
probability; speech processing; PAT model; PAT2; probabilistic acoustic tube model; probabilistic modeling; signal modeling; speech analysis; speech decomposition; speech parameters; vocal tract; Discrete Fourier transforms; Image reconstruction; Mel frequency cepstral coefficient; Probabilistic logic; Speech; Speech processing; Probabilistic generative model; model-based speech processing; speech modeling;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
Conference_Location :
Florence
Type :
conf
DOI :
10.1109/ICASSP.2014.6855144
Filename :
6855144