• DocumentCode
    2527692
  • Title
    A System for Evaluation of Arabic Root Extraction Methods
  • Author
    El Salam Al Hajjar, Abd; Hajjar, Mohammad; Zreik, Khaldoun
  • Author_Institution
    Paragraph Lab., Univ. of Paris 8, Vincennes-Saint-Denis, France
  • fYear
    2010
  • fDate
    9-15 May 2010
  • Firstpage
    506
  • Lastpage
    512
  • Abstract
    In this article, we present a new application that evaluates the performance of several Arabic root extraction methods. The methods implemented in this system were selected according to a previous classification that groups such methods into five categories, and we selected one method from each category: Light Stemmer, Arabic Stemming without a root dictionary, MT-based Arabic Stemmer, N-gram based on similarity coefficient, and N-gram based on dissimilarity coefficient. All methods were evaluated on the same terms, using a corpus of two thousand words and their roots taken from the Arabic dictionary "Lesan Al-Arab". The application has enabled a first original comparison of the evaluated methods. The system operates in two modes: normal and automatic.
  • Keywords
    information retrieval; languages; Arabic root extraction methods; Arabic stemming; N-gram based on dissimilarity coefficient; N-gram based on similarity coefficient; light stemmer; root dictionary; Data mining; Dictionaries; Information retrieval; Laboratories; Tagging; Testing; Vocabulary; Web and internet services; Arabic language; Dictionary; Evaluation; Information extraction; N-gram; Stemmer
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Internet and Web Applications and Services (ICIW), 2010 Fifth International Conference on
  • Conference_Location
    Barcelona
  • Print_ISBN
    978-1-4244-6728-0
  • Type
    conf
  • DOI
    10.1109/ICIW.2010.98
  • Filename
    5476492