• DocumentCode
    564874
  • Title

    A comparative study on Arabic POS tagging using Quran corpus

  • Author

    Alashqar, Abdelkareem M.

  • Author_Institution
    Fac. of Inf. Technol., Islamic Univ. of Gaza, Gaza, Palestinian Authority
  • fYear
    2012
  • fDate
    14-16 May 2012
  • Abstract
    Part-of-speech (POS) tagging is the process of computationally assigning the correct part of speech to each word of a given input text, depending on its context. Various POS tagging techniques have been developed in the literature, and most have been evaluated on English. Some comparable work has been done for Arabic, but comparative studies of POS tagging for Arabic remain relatively rare. In this paper we compare the performance of several POS tagging techniques for Arabic using the Quran corpus. These techniques include N-Gram, Brill, HMM, and TnT taggers. The comparison experiments were carried out on diacritized and undiacritized Classical Arabic. We sought to determine which technique gives the best performance in our case.
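    Illustrative note (not part of the original record): the simplest member of the N-Gram family named in the abstract is a unigram tagger, which assigns each word the tag it carried most often in training. The sketch below is a minimal pure-Python version under stated assumptions: the function names, the default tag "N", and the toy English sentences are all hypothetical placeholders, not the paper's implementation, tagset, or Quran corpus data. It also shows the kind of token-level accuracy measure such comparisons typically report.

```python
from collections import Counter, defaultdict


def train_unigram(tagged_sents):
    """Map each word to the tag it received most often in training."""
    counts = defaultdict(Counter)
    for sent in tagged_sents:
        for word, tag in sent:
            counts[word][tag] += 1
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}


def tag(model, words, default="N"):
    """Tag words with the model; unseen words get the default tag (an assumption)."""
    return [(w, model.get(w, default)) for w in words]


def accuracy(model, gold_sents, default="N"):
    """Fraction of tokens whose predicted tag matches the gold tag."""
    correct = total = 0
    for sent in gold_sents:
        words = [w for w, _ in sent]
        predicted = tag(model, words, default)
        for (_, p), (_, g) in zip(predicted, sent):
            correct += (p == g)
            total += 1
    return correct / total if total else 0.0


# Toy placeholder data (hypothetical; not drawn from any real corpus).
train = [
    [("the", "DET"), ("dog", "N"), ("runs", "V")],
    [("the", "DET"), ("cat", "N"), ("sleeps", "V")],
]
test = [
    [("the", "DET"), ("cat", "N"), ("runs", "V")],
    [("a", "DET"), ("dog", "N")],
]

model = train_unigram(train)
acc = accuracy(model, test)
print(f"token accuracy: {acc:.2f}")
```

    The unseen word "a" falls back to the default tag, which is why unknown-word handling tends to matter when taggers are compared on small or morphologically rich corpora.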
  • Keywords
    grammars; identification technology; natural languages; text analysis; Arabic POS tagging techniques; Arabic language; Brill; English language; HMM; N-gram; Quran corpus; TnT taggers; part of speech; Accuracy; Context; Hidden Markov models; Informatics; Natural language processing; Speech; Tagging; NLP; Natural Language Processing; POS; Part of Speech Tagging; Quran Corpus;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Informatics and Systems (INFOS), 2012 8th International Conference on
  • Conference_Location
    Cairo
  • Print_ISBN
    978-1-4673-0828-1
  • Type

    conf

  • Filename
    6236604