DocumentCode :
564877
Title :
Person name extraction from Modern Standard Arabic or Colloquial text
Author :
Zayed, Omnia H. ; El-Beltagy, S.R.
Author_Institution :
Center for Inf. Sci., Nile Univ., Cairo, Egypt
fYear :
2012
fDate :
14-16 May 2012
Abstract :
Person name extraction from Arabic text is a challenging task. While most existing Arabic texts are written in Modern Standard Arabic (MSA), the volume of colloquial Arabic text is increasing steadily with the widespread use of social media such as Facebook, Google Moderator and Twitter. Previous work addressed extracting persons' names from MSA text only, especially from news articles. Previous work also relied on many resources, such as gazetteers for places, organizations, verbs, and person names. In this paper we introduce a system for extracting persons' names from any type of Arabic text, whether MSA or colloquial, using very few resources. The system integrates Natural Language Processing (NLP) with a limited set of dictionaries to extract persons' names from Arabic text. The paper also presents the results of evaluating the system on two datasets, one for MSA and the other for colloquial Arabic. The results were found to be satisfactory in terms of precision, recall and F-measure.
Keywords :
information retrieval; natural language processing; social networking (online); text analysis; text detection; Arabic colloquial text; Facebook; Google; MSA; Moderator; NLP; Twitter; datasets; dictionaries; f-measure; modern standard Arabic text; person name extraction; precision; recall; social media; Computers; Educational institutions; Grammar; Informatics; Standards; Modern Standard Arabic; colloquial Arabic; named entity recognition; natural language processing (NLP)
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Informatics and Systems (INFOS), 2012 8th International Conference on
Conference_Location :
Cairo
Print_ISBN :
978-1-4673-0828-1
Type :
conf
Filename :
6236607