DocumentCode :
2178713
Title :
Syllabification of conversational speech using Bidirectional Long-Short-Term Memory Neural Networks
Author :
Landsiedel, Christian ; Edlund, Jens ; Eyben, Florian ; Neiberg, Daniel ; Schuller, Björn
Author_Institution :
Dept. for Speech, Music & Hearing, Royal Inst. of Technol., Stockholm, Sweden
fYear :
2011
fDate :
22-27 May 2011
Firstpage :
5256
Lastpage :
5259
Abstract :
Segmentation of speech signals is a crucial task in many types of speech analysis. We present a novel approach to segmentation at the syllable level, using a Bidirectional Long-Short-Term Memory Neural Network. It estimates syllable nucleus positions by regressing perceptually motivated input features onto a smooth target function. Peak selection is then performed to obtain valid nucleus positions. Performance of the model is evaluated on the levels of both syllables and the vowel segments making up the syllable nuclei. The general applicability of the approach is illustrated by good results on two common databases, Switchboard and TIMIT, covering both read and spontaneous speech, and by a favourable comparison with other published results.
Keywords :
recurrent neural nets; speech synthesis; TIMIT; bidirectional long-short-term memory neural networks; smooth target function; speech analysis; speech signal segmentation; spontaneous speech; syllabification; syllable nuclei; syllable nucleus positions; Artificial neural networks; Correlation; Rhythm; Speech; Speech recognition; Switches; Training; Dialogue Systems; Recurrent Neural Networks; Speech Analysis; Syllabification;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on
Conference_Location :
Prague
ISSN :
1520-6149
Print_ISBN :
978-1-4577-0538-0
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2011.5947543
Filename :
5947543