DocumentCode
564874
Title
A comparative study on Arabic POS tagging using Quran corpus
Author
Alashqar, Abdelkareem M.
Author_Institution
Fac. of Inf. Technol., Islamic Univ. of Gaza, Gaza, Palestinian Authority
fYear
2012
fDate
14-16 May 2012
Abstract
POS tagging is the process of computationally assigning the correct part of speech to each word of a given input text, depending on its context. The POS tagging techniques in the literature have mostly been developed and evaluated for English. Some comparable work has been done for Arabic, but comparative studies of POS tagging for Arabic remain scarce. In this paper we compare the performance of several POS tagging techniques for Arabic using the Quran corpus. These techniques include the N-gram, Brill, HMM, and TnT taggers. The comparison experiments were carried out on both diacritized and undiacritized Classical Arabic, to determine which technique gives the best performance in this setting.
Keywords
grammars; identification technology; natural languages; text analysis; Arabic POS tagging techniques; Arabic language; Brill; English language; HMM; N-gram; Quran corpus; TnT taggers; part of speech; Accuracy; Context; Hidden Markov models; Informatics; Natural language processing; Speech; Tagging; NLP; Natural Language Processing; POS; Part of Speech Tagging; Quran Corpus
fLanguage
English
Publisher
IEEE
Conference_Titel
Informatics and Systems (INFOS), 2012 8th International Conference on
Conference_Location
Cairo
Print_ISBN
978-1-4673-0828-1
Type
conf
Filename
6236604