Title :
A two level lexical stress assignment model for highly inflected Slovenian language
Author_Institution :
Dept. of Intelligent Syst., Inst. "Jozef Stefan", Ljubljana, Slovenia
Abstract :
The paper presents a two-level lexical stress assignment model for out-of-vocabulary Slovenian words, used in our text-to-speech system. First, each vowel is classified as stressed or unstressed, and a type of lexical stress is assigned to every stressed vowel. Then, corrections are made at the word level, according to the number of stressed vowels and the length of the word. We applied machine-learning techniques (decision trees or boosted decision trees). The accuracy achieved by decision trees significantly outperforms all previous results. However, the sizes of the trees indicate that accentuation in the Slovenian language is a very complex problem and that a solution in the form of relatively simple rules is not possible.
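The two-level flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: `toy_scorer` is a hypothetical stand-in for a trained decision tree, the vowel set, the threshold, and the `max_short_len` cutoff are assumptions, and the word-level correction rules are guesses at the kind of rule the abstract mentions. The stress type assigned to each stressed vowel is omitted.

```python
from typing import Callable, List

# Assumption: plain vowel letters only; syllabic "r" is ignored in this sketch.
VOWELS = set("aeiou")

# A level-1 scorer maps (word, vowel index) to a stress score in [0, 1].
Scorer = Callable[[str, int], float]


def toy_scorer(word: str, i: int) -> float:
    """Hypothetical stand-in for a trained decision tree (not the paper's model)."""
    vowel_positions = [k for k, ch in enumerate(word) if ch in VOWELS]
    if len(vowel_positions) == 1:
        return 0.9
    # Toy rule: favour the penultimate vowel.
    return 0.8 if i == vowel_positions[-2] else 0.2


def assign_stress(word: str, scorer: Scorer, threshold: float = 0.5,
                  max_short_len: int = 6) -> List[int]:
    """Return the character indices of vowels marked as stressed."""
    word = word.lower()
    scored = [(i, scorer(word, i)) for i, ch in enumerate(word) if ch in VOWELS]
    if not scored:
        return []
    # Level 1: each vowel is classified independently.
    stressed = [i for i, p in scored if p >= threshold]
    # Level 2: word-level correction (assumed rules, based on the number of
    # stressed vowels and the word length).
    best = max(scored, key=lambda t: t[1])[0]
    if not stressed:
        stressed = [best]  # every word gets at least one stressed vowel
    elif len(stressed) > 1 and len(word) <= max_short_len:
        stressed = [best]  # short words keep only the top-scoring vowel
    return stressed
```

Calling `assign_stress("beseda", toy_scorer)` marks the vowel at index 3 under the toy rule. In a real system the scorer would be replaced by a classifier trained on a pronunciation dictionary, with features drawn from the vowel's context.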
Keywords :
learning (artificial intelligence); natural languages; speech synthesis; vocabulary; Slovenian word; accentuation; boosted decision trees; inflected Slovenian language; machine learning; stressed vowel; text-to-speech system; two level lexical stress assignment model; unstressed vowel; vocabulary; vowel determination; word length; word level correction; Databases; Decision trees; Dictionaries; Intelligent systems; Natural languages; Personal digital assistants; Speech synthesis; Stress; Tree graphs; Vocabulary;
Conference_Title :
Information Technology and Applications, 2005. ICITA 2005. Third International Conference on
Print_ISBN :
0-7695-2316-1
DOI :
10.1109/ICITA.2005.48