• DocumentCode
    619641
  • Title
    The impact of Arabic inter-character proximity and similarity on spell-checking
  • Author
    Gueddah, Hicham; Yousfi, Abdallah
  • Author_Institution
    SIME Lab. ENSIAS, Univ. of Mohammed V-Souissi, Rabat, Morocco
  • fYear
    2013
  • fDate
    8-9 May 2013
  • Firstpage
    1
  • Lastpage
    4
  • Abstract
    A statistical study of the typographical errors made when typing documents in the Arabic language found that most of these typos are character permutation errors, which account for 65% of all errors. Analysis of these errors showed that a character permutation resulting in a misspelled word can be attributed either to the proximity of the characters on an Arabic keyboard or to the calligraphic similarity between them. To remedy this problem, we propose in this article that a measure of proximity and similarity between Arabic characters be integrated into the Levenshtein algorithm, with the aim of improving both the suggestions and the ranking of the candidate corrections returned by the spell-checker. The experimental results are very satisfactory and attest to the need to integrate inter-character proximity and similarity measures for Arabic into spell-checking systems.
  • Keywords
    natural language processing; statistical analysis; Arabic intercharacter proximity; Arabic keyboard character; Arabic language; Levenshtein algorithm; calligraphic similarity; character permutation errors; similarity measures; spell-checking systems; statistical study; Dictionaries; Educational institutions; Keyboards; Measurement uncertainty; Natural language processing; Scheduling; Training; Arabic Language; Levenshtein distance; Measurement; Permutation Error; Proximity; Similarity; Spelling Correction;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Intelligent Systems: Theories and Applications (SITA), 2013 8th International Conference on
  • Conference_Location
    Rabat
  • Print_ISBN
    978-1-4799-0297-2
  • Type
    conf
  • DOI
    10.1109/SITA.2013.6560811
  • Filename
    6560811
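
The idea described in the abstract, a Levenshtein distance in which replacing one character with another costs less when the two are keyboard neighbours or look alike, can be sketched as follows. This is only a minimal illustration under stated assumptions, not the authors' method: the record does not give the paper's actual measures, so the pair tables, the 0.5 weight, and the names `KEYBOARD_NEIGHBORS`, `SIMILAR_SHAPES`, `substitution_cost`, `weighted_levenshtein`, and `rank_suggestions` are all hypothetical. The sketch also assumes that the "permutation" errors of the abstract are modelled as single-character substitutions.

```python
# Illustrative, hand-picked pairs only; NOT the tables or weights from the paper.
# Pairs of letters that sit next to each other on a common Arabic keyboard layout.
KEYBOARD_NEIGHBORS = {frozenset("بل"), frozenset("تن"), frozenset("سي")}
# Pairs of letters whose shapes differ mainly by their dots.
SIMILAR_SHAPES = {frozenset("بت"), frozenset("جح"), frozenset("رز"), frozenset("سش")}


def substitution_cost(a, b, near_cost=0.5):
    """Cost of replacing a with b; cheaper for close or look-alike letters.

    near_cost is an assumed weight chosen for illustration.
    """
    if a == b:
        return 0.0
    pair = frozenset((a, b))
    if pair in KEYBOARD_NEIGHBORS or pair in SIMILAR_SHAPES:
        return near_cost
    return 1.0


def weighted_levenshtein(source, target):
    """Levenshtein distance with unit insert/delete and weighted substitution."""
    prev = [float(j) for j in range(len(target) + 1)]
    for i, s in enumerate(source, 1):
        curr = [float(i)]
        for j, t in enumerate(target, 1):
            curr.append(min(
                prev[j] + 1.0,                          # delete s
                curr[j - 1] + 1.0,                      # insert t
                prev[j - 1] + substitution_cost(s, t),  # substitute s -> t
            ))
        prev = curr
    return prev[-1]


def rank_suggestions(word, dictionary):
    """Order candidate corrections from closest to farthest."""
    return sorted(dictionary, key=lambda w: weighted_levenshtein(word, w))
```

With plain Levenshtein, two candidates that each differ from the typed string by one letter tie at distance 1. With the weighted cost, the candidate whose differing letter is a neighbour or look-alike of the typed letter moves ahead in the ranking, which is the effect on suggestion ordering that the abstract describes.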