DocumentCode
1937262
Title
Multilevel annotation of speech signals using weighted finite state transducers
Author
Paulo, Sérgio ; Oliveira, Luís
Author_Institution
Spoken Language Syst. Lab, INESC, Lisbon, Portugal
fYear
2002
fDate
11-13 Sept. 2002
Firstpage
111
Lastpage
114
Abstract
The purpose of this work was to develop a set of tools that automate the multilevel annotation of speech signals while preserving the alignments between the different levels of the utterance's linguistic representation. Our goal is to build speech databases with multilevel relational annotations, using speech from non-professional speakers, that can be used for the development of concatenative-based text-to-speech synthesizers or for training and testing statistical models. The method is based on a linguistic analysis, performed by a TTS system, of the transcription of the spoken material. The predicted phone sequence is then compared with the sequence actually produced by the speaker. The problem of aligning these two sequences is solved in a language-independent way using Weighted Finite State Transducers. After the alignment, a re-synchronization procedure is applied to the remaining levels to bring them into agreement with the spoken utterance.
Keywords
linguistics; speech processing; speech synthesis; statistical analysis; TTS system; concatenative-based text-to-speech synthesizers; linguistic analysis; linguistic representation; multilevel relational annotations; predicted phone sequence; speech databases; speech signals; statistical model testing; statistical model training; weighted finite state transducers; Natural languages; Performance analysis; Relational databases; Signal processing; Speech processing; Speech recognition; Speech synthesis; Synthesizers; Testing; Transducers
fLanguage
English
Publisher
IEEE
Conference_Titel
Speech Synthesis, 2002. Proceedings of 2002 IEEE Workshop on
Print_ISBN
0-7803-7395-2
Type
conf
DOI
10.1109/WSS.2002.1224384
Filename
1224384