DocumentCode
2664759
Title
Language processing for name and address reading in Hungarian
Author
Németh, Géza ; Zainkó, Csaba ; Kiss, Géza ; Fék, Márk ; Gordos, Géza ; Olaszy, Gábor
Author_Institution
Dept. of Telecommun. & Media Informatics, Budapest Univ. of Technol. & Econ., Hungary
fYear
2003
fDate
26-29 Oct. 2003
Firstpage
238
Lastpage
243
Abstract
Name and address reading is an important combined application area of language processing and text-to-speech (TTS) systems. It is the cornerstone of both traditional reverse directory telephone services and new location-based traffic and tour guide applications. The language processing aspects of a solution for Hungarian are described. The work was based on the analysis of a subscriber database containing about 3 million records (there are about 10 million Hungarian citizens). Categories of name and address elements were defined. A program for the automatic classification of database records was developed. Statistical parameters were derived for proper/legal names and addresses. Based on these results, text corpora for enriching the TTS acoustic database were designed. Reading strategies and related special algorithms and tables were developed for the description of complex name categories. Our results may be applied to similar tasks in other languages with comparable linguistic and statistical features.
Keywords
database management systems; linguistics; natural languages; speech synthesis; Hungarian; Hungarian citizen; TTS acoustic database; address reading; automatic syllabification; corpus analysis; directory telephone service; language processing; name reading; speech synthesis; statistical parameter; subscriber database; text-to-speech system; tour guide application; Acoustic applications; Continuous improvement; Databases; GSM; Humans; Informatics; Laboratories; Natural languages; Speech synthesis; Telephony
fLanguage
English
Publisher
IEEE
Conference_Titel
Natural Language Processing and Knowledge Engineering, 2003. Proceedings. 2003 International Conference on
Conference_Location
Beijing, China
Print_ISBN
0-7803-7902-0
Type
conf
DOI
10.1109/NLPKE.2003.1275906
Filename
1275906