Regional Information Center for Science and Technology - Text normalization for Croatian speech synthesis

DocumentCode :

3642760

Title :

Text normalization for Croatian speech synthesis

Author :

Slobodan Beliga;Sanda Martinčić-Ipšić

Author_Institution :

Department of Informatics, University of Rijeka, Omladinska 14, 51000, Rijeka, Croatia

fYear :

2011

fDate :

5/1/2011

Firstpage :

1664

Lastpage :

1669

Abstract :

This paper presents text normalization, an integral part of any text-to-speech (TTS) synthesis system. Text normalization is a set of methods for writing non-standard words (NSWs) in their fully expanded form. The paper presents algorithms that transform NSWs (numbers, dates, times, abbreviations, acronyms and the most common symbols) into their expanded form as Croatian text. It proposes a complete taxonomy for classifying non-standard words in Croatian, together with rule-based normalization methods combined with a lookup dictionary. The paper concludes with a discussion of how the proposed text normalization could be integrated into the existing text-to-speech synthesis system.
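
The combination of rule-based expansion and a lookup dictionary described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the dictionary entries, the function names (`expand_number`, `normalize`) and the restriction to the numbers 0 to 99 are all assumptions made for illustration. The sketch also ignores Croatian gender and case agreement, which a real normalizer would have to handle.

```python
import re

# Hypothetical lookup dictionary (illustrative entries, not taken from the paper).
ABBREVIATIONS = {
    "dr.": "doktor",
    "npr.": "na primjer",
    "itd.": "i tako dalje",
    "tj.": "to jest",
}

# Croatian cardinal numbers in their base (masculine nominative) form.
UNITS = [
    "nula", "jedan", "dva", "tri", "četiri", "pet", "šest", "sedam",
    "osam", "devet", "deset", "jedanaest", "dvanaest", "trinaest",
    "četrnaest", "petnaest", "šesnaest", "sedamnaest", "osamnaest",
    "devetnaest",
]
TENS = {
    2: "dvadeset", 3: "trideset", 4: "četrdeset", 5: "pedeset",
    6: "šezdeset", 7: "sedamdeset", 8: "osamdeset", 9: "devedeset",
}


def expand_number(n: int) -> str:
    """Rule-based expansion of an integer in the range 0..99."""
    if not 0 <= n <= 99:
        raise ValueError("this sketch only covers 0..99")
    if n < 20:
        return UNITS[n]
    tens, unit = divmod(n, 10)
    return TENS[tens] if unit == 0 else f"{TENS[tens]} {UNITS[unit]}"


def _expand_abbreviation(match: re.Match) -> str:
    # Dictionary lookup; unknown "word." tokens are left unchanged.
    token = match.group(0)
    return ABBREVIATIONS.get(token.lower(), token)


def _expand_digits(match: re.Match) -> str:
    # Numbers outside the supported range are left unchanged.
    value = int(match.group(0))
    return expand_number(value) if value <= 99 else match.group(0)


def normalize(text: str) -> str:
    """Expand abbreviations, small numbers and the percent sign."""
    text = re.sub(r"[^\W\d_]+\.", _expand_abbreviation, text)
    text = re.sub(r"\d+", _expand_digits, text)
    text = re.sub(r"\s*%", " posto", text)
    return text


print(normalize("dr. Horvat ima 25% popusta, npr. danas."))
# prints: doktor Horvat ima dvadeset pet posto popusta, na primjer danas.
```

Each non-standard word class gets its own rule here, and the dictionary handles only the cases that rules cannot derive. Dates, times and acronyms, which the abstract also lists, would need additional patterns and are left out of this sketch.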

Keywords :

"Classification algorithms","Taxonomy","Classification tree analysis","Speech","Dictionaries"

Publisher :

IEEE

Conference_Title :

MIPRO, 2011 Proceedings of the 34th International Convention

Print_ISBN :

978-1-4577-0996-8

Type :

conf

Filename :

5967328

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3642760