DocumentCode
178428
Title
Automatic discovery of a phonetic inventory for unwritten languages for statistical speech synthesis
Author
Muthukumar, Prasanna Kumar ; Black, Alan W.
Author_Institution
Language Technol. Inst., Carnegie Mellon Univ., Pittsburgh, PA, USA
fYear
2014
fDate
4-9 May 2014
Firstpage
2594
Lastpage
2598
Abstract
Speech synthesis systems are typically built with speech data and transcriptions. In this paper, we try to build synthesis systems when no transcriptions or knowledge about the language are available. It is usually necessary to at least possess phonetic knowledge about the language. We propose an automated way of obtaining phones and phonetic knowledge about the corpus at hand by making use of Articulatory Features (AFs). An Articulatory Feature predictor, a neural network with three hidden layers, is trained on a bootstrap corpus in an arbitrary other language. This neural network is run on the speech corpus to extract AFs. Hierarchical clustering is used to cluster the AFs into categories, i.e., phones. Phonetic information about each of these inferred phones is obtained by computing the mean of the AFs in each cluster. Results of systems built with this framework in multiple languages are reported.
Keywords
neural nets; pattern clustering; speech synthesis; statistical analysis; AF; articulatory feature predictor; bootstrap corpus; hierarchical clustering; phonetic inventory; phonetic knowledge; speech corpus; speech data; speech transcriptions; statistical speech synthesis; three-hidden layer neural network; unwritten languages; Feature extraction; Speech; Speech recognition; Synthesizers; TTS without text; articulatory features; neural networks; un-labeled speech corpora
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
Conference_Location
Florence
Type
conf
DOI
10.1109/ICASSP.2014.6854069
Filename
6854069