Title :
Automatic discovery of a phonetic inventory for unwritten languages for statistical speech synthesis
Author :
Muthukumar, Prasanna Kumar ; Black, Alan W.
Author_Institution :
Language Technol. Inst., Carnegie Mellon Univ., Pittsburgh, PA, USA
Abstract :
Speech synthesis systems are typically built from speech data and the corresponding transcriptions. In this paper, we attempt to build synthesis systems when neither transcriptions nor knowledge about the language is available. Building a synthesizer usually requires at least phonetic knowledge of the language. We propose an automated way of obtaining phones and phonetic knowledge about the corpus at hand by making use of Articulatory Features (AFs). An AF predictor, a neural network with three hidden layers, is trained on a bootstrap corpus in an arbitrary other language. This network is then run on the target speech corpus to extract AFs. Hierarchical clustering groups the AFs into categories, which are treated as phones. Phonetic information about each inferred phone is obtained by computing the mean of the AFs in its cluster. Results are reported for systems built with this framework in multiple languages.
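The clustering and averaging steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy AF vectors, the three feature dimensions (voiced, nasal, vowel), the centroid-distance merge rule, and the function names `agglomerative_cluster` and `phone_profiles` are all assumptions made for illustration, since the abstract does not state the linkage criterion or the AF set.

```python
from itertools import combinations


def mean_vector(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def agglomerative_cluster(frames, n_clusters):
    """Bottom-up clustering: repeatedly merge the two clusters whose
    centroids are closest until n_clusters remain.
    (Centroid distance is an assumed merge rule, not taken from the paper.)"""
    clusters = [[i] for i in range(len(frames))]
    while len(clusters) > n_clusters:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            ca = mean_vector([frames[i] for i in clusters[a]])
            cb = mean_vector([frames[i] for i in clusters[b]])
            d = sq_dist(ca, cb)
            if best is None or d < best[0]:
                best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


def phone_profiles(frames, clusters):
    """Mean AF vector per cluster: the phonetic description of each inferred phone."""
    return [mean_vector([frames[i] for i in c]) for c in clusters]


# Hypothetical AF predictor outputs, one vector per frame: [voiced, nasal, vowel]
AF_FRAMES = [
    [0.9, 0.1, 0.9],
    [0.8, 0.0, 1.0],
    [0.9, 0.9, 0.1],
    [1.0, 0.8, 0.0],
    [0.1, 0.0, 0.1],
    [0.0, 0.1, 0.0],
]

CLUSTERS = agglomerative_cluster(AF_FRAMES, n_clusters=3)
PROFILES = phone_profiles(AF_FRAMES, CLUSTERS)

for members, profile in zip(CLUSTERS, PROFILES):
    print(members, [round(v, 2) for v in profile])
```

In this toy setting each cluster mean reads as a rough phonetic label: a cluster with high voiced and vowel values behaves like a vowel, while one with high voiced and nasal values behaves like a nasal consonant. On a real corpus the number of clusters would itself have to be chosen, which this sketch does not address.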
Keywords :
neural nets; pattern clustering; speech synthesis; statistical analysis; AF; articulatory feature predictor; bootstrap corpus; hierarchical clustering; phonetic inventory; phonetic knowledge; speech corpus; speech data; speech transcriptions; statistical speech synthesis; three-hidden layer neural network; unwritten languages; Feature extraction; Speech; Speech recognition; Speech synthesis; Synthesizers; TTS without text; articulatory features; neural networks; un-labeled speech corpora
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
Conference_Location :
Florence
DOI :
10.1109/ICASSP.2014.6854069