Title :
The impact of Arabic inter-character proximity and similarity on spell-checking
Author :
Gueddah, Hicham ; Yousfi, Abdallah
Author_Institution :
SIME Lab. ENSIAS, Univ. of Mohammed V-Souissi, Rabat, Morocco
Abstract :
A statistical study of the typographical errors made when typing documents in Arabic found that most of these typos are character permutation errors, which account for 65% of all errors. Analysis of these errors showed that a character permutation that produces a misspelled word can be attributed either to the proximity of the characters on an Arabic keyboard or to the calligraphic similarity between the Arabic characters involved. To address this problem, we propose in this article that a measure of proximity and similarity between Arabic characters be integrated into the Levenshtein algorithm, with the aim of improving both the suggestions returned by the spelling corrector and the order in which they are presented. The experimental results are very satisfactory and attest to the need to integrate inter-character proximity and similarity measures for Arabic into spell-checking systems.
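The idea of folding proximity and similarity into the Levenshtein algorithm can be illustrated with a minimal sketch. It is not the measure defined in the paper: the two small lookup tables, the discounted cost of 0.5, and the names `substitution_cost`, `weighted_levenshtein`, and `rank_suggestions` are all assumptions chosen for illustration. The sketch only shows how a cheaper substitution for "close" characters changes the ordering of suggestions.

```python
# Illustrative sketch only. The tables and the 0.5 discount are assumptions,
# not the proximity/similarity measures defined in the paper.

# A few pairs of letters on neighbouring keys of a common Arabic layout
# (deliberately incomplete sample).
ADJACENT_KEYS = {
    frozenset(("ش", "س")),
    frozenset(("ت", "ن")),
    frozenset(("ن", "م")),
}

# A few pairs of letters with the same base shape, differing only by dots
# (deliberately incomplete sample).
SIMILAR_SHAPES = {
    frozenset(("ب", "ت")),
    frozenset(("ج", "ح")),
    frozenset(("س", "ش")),
    frozenset(("ر", "ز")),
}


def substitution_cost(a, b, discounted=0.5):
    """Cost of replacing a with b: 0 if equal, reduced if the pair is close."""
    if a == b:
        return 0.0
    pair = frozenset((a, b))
    if pair in ADJACENT_KEYS or pair in SIMILAR_SHAPES:
        return discounted  # hypothetical reduced cost for close characters
    return 1.0


def weighted_levenshtein(source, target):
    """Levenshtein distance with unit insert/delete and weighted substitution."""
    prev = [float(j) for j in range(len(target) + 1)]
    for i, s in enumerate(source, 1):
        curr = [float(i)]
        for j, t in enumerate(target, 1):
            curr.append(min(
                prev[j] + 1.0,                          # delete s
                curr[j - 1] + 1.0,                      # insert t
                prev[j - 1] + substitution_cost(s, t),  # substitute s by t
            ))
        prev = curr
    return prev[-1]


def rank_suggestions(word, lexicon):
    """Order candidate corrections by increasing weighted distance."""
    return sorted(lexicon, key=lambda w: weighted_levenshtein(word, w))
```

With the plain Levenshtein distance, both candidates for the mistyped word below would sit at distance 1. Under the assumed weights the candidate reached through a close-character substitution comes first, for example `rank_suggestions("شلام", ["كلام", "سلام"])` places "سلام" ahead of "كلام".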
Keywords :
natural language processing; statistical analysis; Arabic intercharacter proximity; Arabic keyboard character; Arabic language; Levenshtein algorithm; calligraphic similarity; character permutation errors; similarity measures; spell-checking systems; statistical study; Dictionaries; Educational institutions; Keyboards; Measurement uncertainty; Natural language processing; Scheduling; Training; Arabic Language; Levenshtein distance; Measurement; Permutation Error; Proximity; Similarity; Spelling Correction;
Conference_Title :
Intelligent Systems: Theories and Applications (SITA), 2013 8th International Conference on
Conference_Location :
Rabat
Print_ISBN :
978-1-4799-0297-2
DOI :
10.1109/SITA.2013.6560811