Title :
Learning model-based F0 production through goal-directed babbling
Author_Institution :
Dept. of Speech, Hearing & Phonetic Sci., Univ. Coll. London, London, UK
Abstract :
How surface acoustics can be mapped to underlying articulatory commands is a central yet unsolved issue in speech acquisition. Previously, stochastic optimization has been shown to be effective at learning the underlying pitch targets of the quantitative Target Approximation (qTA) model. The present study tested whether it is possible to develop an acoustic-to-articulatory inverse model for qTA by taking advantage of a recent advance in inverse kinematics learning in the field of developmental robotics, known as goal babbling. By treating the traditionally separate babbling and imitation stages of speech acquisition as a unified acoustic goal-directed babbling process, the inverse model, implemented as a multilayer perceptron (MLP), can be bootstrapped rapidly without exploring the whole articulatory command space. The MLP was trained in online mode with self-generated examples obtained after every production of the host learner. The results show that with this novel learning paradigm the inverse model improves progressively, and that underlying pitch targets can be obtained by querying the mature inverse model. Our findings also demonstrate that qTA is an intrinsically robust F0 production model that can be operated under various learning regimens.
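The goal-babbling loop summarized in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the forward() function is a simplified first-order stand-in for F0 production (not the qTA equations), the command is reduced to a target slope m and height b, randomly drawn contours stand in for ambient acoustic goals, and the names InverseMLP and goal_babbling, the network size, the learning rate and the noise level are all hypothetical.

```python
import numpy as np

T = np.linspace(0.2, 1.0, 5)   # assumed normalized sample times within one syllable
LAM = 5.0                      # assumed rate at which F0 approaches the target


def forward(cmd):
    """Toy F0 production (NOT qTA): linear target m*t + b, approached from F0 = 0."""
    m, b = cmd
    return m * T + b * (1.0 - np.exp(-LAM * T))


class InverseMLP:
    """One-hidden-layer perceptron mapping an F0 contour to a command (m, b)."""

    def __init__(self, n_in, n_hidden, n_out, rng, lr=0.01):
        self.W1 = rng.standard_normal((n_hidden, n_in)) / np.sqrt(n_in)
        self.b1 = np.zeros(n_hidden)
        self.W2 = 0.1 * rng.standard_normal((n_out, n_hidden))
        self.b2 = np.zeros(n_out)
        self.lr = lr

    def predict(self, x):
        return self.W2 @ np.tanh(self.W1 @ x + self.b1) + self.b2

    def update(self, x, target):
        """One online gradient step on the squared error for a single example."""
        h = np.tanh(self.W1 @ x + self.b1)
        err = (self.W2 @ h + self.b2) - target
        grad_h = (self.W2.T @ err) * (1.0 - h ** 2)   # backpropagate through tanh
        self.W2 -= self.lr * np.outer(err, h)
        self.b2 -= self.lr * err
        self.W1 -= self.lr * np.outer(grad_h, x)
        self.b1 -= self.lr * grad_h


def reproduction_error(model, goals):
    """Mean RMS distance between each goal contour and the contour produced for it."""
    return float(np.mean([np.sqrt(np.mean((forward(model.predict(g)) - g) ** 2))
                          for g in goals]))


def goal_babbling(n_steps=10000, noise=0.3, seed=0):
    rng = np.random.default_rng(seed)
    model = InverseMLP(len(T), 16, 2, rng)
    test_goals = [forward(rng.uniform(-0.8, 0.8, 2)) for _ in range(50)]
    err_before = reproduction_error(model, test_goals)
    for _ in range(n_steps):
        goal = forward(rng.uniform(-0.8, 0.8, 2))        # acoustic goal to imitate
        cmd = model.predict(goal) + noise * rng.standard_normal(2)  # query + explore
        cmd = np.clip(cmd, -1.0, 1.0)                    # stay inside command space
        outcome = forward(cmd)                           # produce and listen
        model.update(outcome, cmd)                       # self-generated example
    return err_before, reproduction_error(model, test_goals)


err_before, err_after = goal_babbling()
print("mean reproduction error before training:", err_before)
print("mean reproduction error after training: ", err_after)
```

In this sketch the learner never samples the command space uniformly: each command comes from querying the current inverse model with an acoustic goal and perturbing the answer, and the resulting (produced contour, executed command) pair is used immediately as one online training example.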
Keywords :
learning (artificial intelligence); multilayer perceptrons; optimisation; speech processing; MLP; acoustic-to-articulatory inverse model; articulatory command space; articulatory commands; developmental robotics; inverse kinematics learning; learning model-based F0 production; learning regimens; multilayer perceptron; pitch targets; qTA; quantitative target approximation model; self-generated examples; speech acquisition; speech acquisition imitation stages; stochastic optimization; surface acoustics; traditionally separated babbling; unified acoustic goal-directed babbling process; Acoustics; Adaptation models; Production; Robot sensing systems; Space exploration; Speech; Stochastic processes; F0 contour modeling; acoustic-to-articulatory inversion; online learning; sensorimotor coordination; speech acquisition; speech production; target approximation;
Conference_Title :
Chinese Spoken Language Processing (ISCSLP), 2014 9th International Symposium on
Conference_Location :
Singapore
DOI :
10.1109/ISCSLP.2014.6936720