• DocumentCode
    3779364
  • Title
    The filtered combination of the weighted edit distance and the Jaro-Winkler distance to improve spellchecking Arabic texts
  • Author
    Hicham Gueddah;Abdellah Yousfi;Mostafa Belkasmi
  • Author_Institution
    Telecom and Embedded Systems Team, SIME Lab ENSIAS, University Mohammed V of Rabat, Morocco
  • fYear
    2015
  • Firstpage
    1
  • Lastpage
    6
  • Abstract
    Digital environments for human learning have evolved considerably thanks to progress in information technology. This is particularly true of automatic spelling correction, which a large majority of users now rely on. Almost all current spellcheckers are semiautomatic: they help the user find the right correction for a given error. The major shortcoming of existing metric-based correction methods is the poor ranking of the candidate corrections suggested when a detected error is corrected out of context. To overcome this limitation, we have developed several approaches that use probabilistic costs estimated during a learning phase; these costs are assigned to the individual editing operations when computing a similarity measure such as the edit distance. The idea developed in this work is to weight these editing operations efficiently without resorting to a learning phase, relying only on the proximity and similarity of keys on the Arabic keyboard. In addition, we propose combining this measure with the Jaro-Winkler distance in order to better filter, refine, and weight candidate corrections relative to one another. The experimental results come from tests on errors committed in a learning corpus; they are intended to validate the design choices and to demonstrate the value of both approaches.
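    The combination described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the adjacency pairs, the 0.5 cost for neighbouring keys, the 0.8 Jaro-Winkler threshold, and the filter-then-rank rule are all assumptions chosen for illustration, since the record gives no exact weights or combination formula.

```python
# Toy key-adjacency table (assumption, not the paper's table): a few
# neighbouring keys on a standard Arabic keyboard layout.
ADJACENT_PAIRS = [("ب", "ي"), ("ب", "ل"), ("ت", "ا"), ("ت", "ن")]
NEIGHBORS = {}
for a, b in ADJACENT_PAIRS:
    NEIGHBORS.setdefault(a, set()).add(b)
    NEIGHBORS.setdefault(b, set()).add(a)


def weighted_edit_distance(s, t, neighbors, near_cost=0.5):
    """Levenshtein distance where substituting adjacent keys costs near_cost."""
    prev = [float(j) for j in range(len(t) + 1)]
    for i, a in enumerate(s, 1):
        cur = [float(i)]
        for j, b in enumerate(t, 1):
            if a == b:
                sub = 0.0
            elif b in neighbors.get(a, ()):
                sub = near_cost  # hypothetical weight for neighbouring keys
            else:
                sub = 1.0
            cur.append(min(prev[j] + 1.0, cur[j - 1] + 1.0, prev[j - 1] + sub))
        prev = cur
    return prev[-1]


def jaro(s, t):
    """Standard Jaro similarity in [0, 1]."""
    if s == t:
        return 1.0
    if not s or not t:
        return 0.0
    window = max(max(len(s), len(t)) // 2 - 1, 0)
    s_match = [False] * len(s)
    t_match = [False] * len(t)
    matches = 0
    for i, ch in enumerate(s):
        for j in range(max(0, i - window), min(len(t), i + window + 1)):
            if not t_match[j] and t[j] == ch:
                s_match[i] = t_match[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len(s)):
        if s_match[i]:
            while not t_match[k]:
                k += 1
            if s[i] != t[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2
    return (matches / len(s) + matches / len(t)
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s, t, p=0.1):
    """Jaro similarity boosted by a common prefix of up to 4 characters."""
    j = jaro(s, t)
    prefix = 0
    for a, b in zip(s[:4], t[:4]):
        if a != b:
            break
        prefix += 1
    return j + prefix * p * (1 - j)


def rank_candidates(word, lexicon, neighbors, jw_min=0.8):
    """Assumed rule: drop candidates below jw_min, then sort by weighted
    edit distance, breaking ties by higher Jaro-Winkler similarity."""
    scored = []
    for cand in lexicon:
        jw = jaro_winkler(word, cand)
        if jw >= jw_min:
            scored.append((weighted_edit_distance(word, cand, neighbors), -jw, cand))
    return [cand for _, _, cand in sorted(scored)]
```

    With this toy table, the misspelling "كتاي" ranks "كتاب" ahead of "كتان", because ي and ب are listed as neighbouring keys while ي and ن are not; a plain edit distance would score both candidates equally.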
  • Keywords
    "Keyboards","Training","Weight measurement","Dictionaries","Telecommunications","Embedded systems"
  • Publisher
    IEEE
  • Conference_Titel
    Computer Systems and Applications (AICCSA), 2015 IEEE/ACS 12th International Conference of
  • Electronic_ISBN
    2161-5330
  • Type
    conf
  • DOI
    10.1109/AICCSA.2015.7507128
  • Filename
    7507128