Title :
A first approach to the evaluation of Arabic diacritization systems
Author :
Bahanshal, Alia O. ; Al-Khalifa, Hend S.
Author_Institution :
Comput. Res. Inst., King Abdulaziz City for Sci. & Technol., Riyadh, Saudi Arabia
Abstract :
Modern Standard Arabic (MSA) is widely used today in newspapers, books, and on the World Wide Web, but it is rarely written with diacritics. Diacritics, symbols placed above or below a letter, change how the letter is pronounced and help readers understand and disambiguate written text. To permit automatic processing of Arabic text, many diacritization systems have been introduced. In this paper, we evaluate the accuracy of some of the available diacritization systems using fully diacritized text from the Holy Quran and short poems from the period of the advent of Islam. We also discuss the results of the evaluation.
Keywords :
natural language processing; text analysis; Arabic diacritization systems; Arabic text; Holy Quran; Islam; MSA; World Wide Web; automatic processing; books; diacritics; modern standard Arabic; newspapers; short poems; written text disambiguation; Accuracy; Cities and towns; Computers; Educational institutions; Standards; Syntactics; Text processing; Arabic Text Processing; Automatic Diacritization
Conference_Title :
Digital Information Management (ICDIM), 2012 Seventh International Conference on
Conference_Location :
Macau
Print_ISBN :
978-1-4673-2428-1
DOI :
10.1109/ICDIM.2012.6360097