DocumentCode :
3756099
Title :
Building a Corpus for Arabic Dialects Using Games with a Purpose
Author :
Maya Osman;Caroline Sabty;Nada Sharaf;Slim Abdennadher
Author_Institution :
German Univ. in Cairo, Cairo, Egypt
fYear :
2015
fDate :
4/1/2015 12:00:00 AM
Firstpage :
21
Lastpage :
25
Abstract :
There is a wide gap between the written form of Arabic, Modern Standard Arabic (MSA), and the spoken Arabic dialects, owing to the large number of dialects. In addition, most Arabic datasets are built from MSA content. Traditional ways of identifying the dialect of a text are time-consuming and costly. Moreover, due to the morphological complexity of Arabic, the gender of the speaker may change the structure of an Arabic sentence. Dialects therefore carry rich information, such as the origin of the speaker and the gender of the addressee. A Game With A Purpose (GWAP) called "3ammeya" is implemented to identify the dialects of Arabic sentences along with their MSA translations. Through the game, the gender of the speaker and of the addressee is also classified. The collected data will help construct an expandable, low-cost corpus for dialect identification and translation to MSA.
Keywords :
"Games","Pragmatics","Standards","Computers","Africa","Bridges","Crowdsourcing"
Publisher :
IEEE
Conference_Titel :
Arabic Computational Linguistics (ACLing), 2015 First International Conference on
Print_ISBN :
978-1-4673-9154-2
Type :
conf
DOI :
10.1109/ACLing.2015.10
Filename :
7422275