DocumentCode :
1739124
Title :
Estimation and generalization of multimodal speech production
Author :
Vatikiotis-Bateson, Eric ; Yehia, Hani C.
Author_Institution :
ATR Inf. Sci. Div., Commun. Dynamics Project, Kyoto, Japan
Volume :
1
fYear :
2000
fDate :
2000
Firstpage :
23
Abstract :
The speech acoustics and the phonetically relevant motion of the face during speech are determined by the time-varying behavior of the vocal tract. One benefit of this linkage is that face motion can be predicted from the spectral acoustics during sentence production. However, the scope of reliable estimation appears to be limited to individual sentences: the analysis degrades sharply when multiple sentences are analyzed together, suggesting boundary constraints tied to sentence length. These constraints are examined in this paper.
Keywords :
acoustics; speech; speech processing; time-varying systems; degradation; generalization; multimodal speech production; phonetically relevant face motion; reliable estimation; sentence length boundary constraints; sentence production; spectral acoustics; speech acoustics; time-varying behavior; vocal tract; Acoustics; Continuous production; Couplings; Degradation; Facial animation; Humans; Motion estimation; Neural networks; Speech enhancement; Speech processing;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Neural Networks for Signal Processing X, 2000. Proceedings of the 2000 IEEE Signal Processing Society Workshop
Conference_Location :
Sydney, NSW
ISSN :
1089-3555
Print_ISBN :
0-7803-6278-0
Type :
conf
DOI :
10.1109/NNSP.2000.889358
Filename :
889358