DocumentCode
515416
Title
A Dependency Treebank of the Quran using traditional Arabic grammar
Author
Dukes, Kais ; Buckwalter, Tim
Author_Institution
School of Computing, University of Leeds, Leeds, UK
fYear
2010
fDate
28-30 March 2010
Firstpage
1
Lastpage
7
Abstract
The Quran is a significant religious text, followed by the 1.5 billion believers of the Islamic faith worldwide. The text dates to 610-632 CE and is written in Quranic Arabic, the direct ancestor language of the Modern Standard Arabic in use today. This paper presents the Quranic Arabic Dependency Treebank (QADT) and reports on the approaches and solutions used to apply Natural Language Processing to the unique and challenging language of the Quran. This project differs from other Arabic treebanks by providing a deep computational linguistic model based on historical traditional Arabic grammar. The treebank is part of the Quranic Arabic Corpus (http://corpus.quran.com), a popular free Arabic resource developed at the University of Leeds. Motivated by the importance of the Quran as a central religious text, we also report on how online collaborative annotation was used to bring together Quranic scholars and Arabic language experts to ensure a high level of accuracy for the grammatical analysis of the entire Quran.
Keywords
grammars; natural language processing; Arabic grammar; Quran dependency treebank; grammatical analysis; Computational linguistics; Computational modeling; Educational institutions; Morphology; Online Communities/Technical Collaboration; Performance analysis; Spatial databases; Tagging; Tree graphs; Arabic; Corpus; Dependency Grammar; Part-of-Speech Tagging; Quran; Treebank Syntax
fLanguage
English
Publisher
IEEE
Conference_Titel
2010 The 7th International Conference on Informatics and Systems (INFOS)
Conference_Location
Cairo
Print_ISBN
978-1-4244-5828-8
Type
conf
Filename
5461810