DocumentCode :
3642760
Title :
Text normalization for Croatian speech synthesis
Author :
Slobodan Beliga;Sanda Martinčić-Ipšić
Author_Institution :
Department of Informatics, University of Rijeka, Omladinska 14, 51000, Rijeka, Croatia
fYear :
2011
fDate :
5/1/2011 12:00:00 AM
Firstpage :
1664
Lastpage :
1669
Abstract :
This paper presents text normalization, an integral part of any text-to-speech (TTS) synthesis system. Text normalization is a set of methods whose task is to write non-standard words (NSWs) in their fully expanded form. Algorithms are presented that transform NSWs (numbers, dates, times, abbreviations, acronyms, and the most common symbols) into their expanded form as Croatian text. A complete taxonomy for classifying non-standard words in the Croatian language is proposed, together with rule-based normalization methods combined with a lookup dictionary. The paper concludes with a discussion of the possible integration of the proposed text normalization into the existing text-to-speech synthesis system.
Keywords :
"Classification algorithms","Taxonomy","Classification tree analysis","Speech","Dictionaries"
Publisher :
IEEE
Conference_Titel :
MIPRO, 2011 Proceedings of the 34th International Convention
Print_ISBN :
978-1-4577-0996-8
Type :
conf
Filename :
5967328