DocumentCode
134345
Title
Learning model-based F0 production through goal-directed babbling
Author
Hao Liu; Yi Xu
Author_Institution
Dept. of Speech, Hearing & Phonetic Sci., Univ. Coll. London, London, UK
fYear
2014
fDate
12-14 Sept. 2014
Firstpage
284
Lastpage
288
Abstract
How surface acoustics can be mapped to underlying articulatory commands is a central yet unsolved issue in speech acquisition. Previously, stochastic optimization has been shown to be effective at learning the underlying pitch targets of the quantitative Target Approximation (qTA) model. The present study tested whether it is possible to develop an acoustic-to-articulatory inverse model for qTA by taking advantage of a recent advance in inverse kinematics learning in the field of developmental robotics, known as goal babbling. By treating the traditionally separated babbling and imitation stages of speech acquisition as a unified, acoustic goal-directed babbling process, the inverse model, implemented as a multilayer perceptron (MLP), can be bootstrapped rapidly without exploring the whole articulatory command space. The MLP was trained in online mode with self-generated examples obtained after every production of the host learner. The results show that with this novel learning paradigm the inverse model improves progressively, and that underlying pitch targets can be obtained by querying the mature inverse model. Our findings also demonstrate that qTA is an intrinsically robust F0 production model that can be operated by various learning regimens.
Keywords
learning (artificial intelligence); multilayer perceptrons; optimisation; speech processing; MLP; acoustic-to-articulatory inverse model; articulatory command space; articulatory commands; developmental robotics; inverse kinematics learning; learning model-based F0 production; learning regimens; multilayer perceptron; pitch targets; qTA; quantitative target approximation model; self-generated examples; speech acquisition; speech acquisition imitation stages; stochastic optimization; surface acoustics; traditionally separated babbling; unified acoustic goal-directed babbling process; Acoustics; Adaptation models; Production; Robot sensing systems; Space exploration; Speech; Stochastic processes; F0 contour modeling; acoustic-to-articulatory inversion; online learning; sensorimotor coordination; speech acquisition; speech production; target approximation
fLanguage
English
Publisher
IEEE
Conference_Titel
Chinese Spoken Language Processing (ISCSLP), 2014 9th International Symposium on
Conference_Location
Singapore
Type
conf
DOI
10.1109/ISCSLP.2014.6936720
Filename
6936720