DocumentCode
3756104
Title
Combined Classification for Extracting Named Entities from Arabic Texts
Author
Fériel Ben Fraj; Chiraz Ben Othmane Zribi; Wiem Kouki
Author_Institution
RIADI Lab., La Manouba Univ., Tunisia
fYear
2015
fDate
4/1/2015 12:00:00 AM
Firstpage
55
Lastpage
60
Abstract
In this paper, we describe an approach for extracting named entities (NEs) from Arabic texts. Arabic is hard to process because of linguistic characteristics that also affect NE extraction. We treat named entity extraction as a typical classification problem: the task consists of searching for text portions that can be assigned to an NE class (Person, Locality or Organization). We therefore use a supervised learning approach and employ the BIO tagging format, which addresses the twin problems of segmentation and categorization. In addition, a single classifier cannot give good results for all types of contexts, so we adopt a set of weighted classifiers that we combine through a voting procedure. To properly assess the performance of our system, we perform two types of tests: with and without morphological attributes. We consider the results highly satisfactory, with an accuracy that exceeds 89% for both the Person and Locality classes.
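The combination of BIO tagging with a weighted vote can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the classifiers, their weights, or the tag names, so the three classifier outputs, the weights `[0.5, 0.3, 0.2]`, the labels `PER` and `LOC`, and the transliterated tokens below are all assumptions made for illustration.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """For each token, keep the BIO tag with the largest summed classifier weight."""
    voted = []
    for token_tags in zip(*predictions):  # one tuple of tags per token
        scores = defaultdict(float)
        for tag, weight in zip(token_tags, weights):
            scores[tag] += weight
        voted.append(max(scores, key=scores.get))
    return voted


def bio_to_spans(tokens, tags):
    """Group BIO tags into (entity_class, entity_text) pairs.

    B-X opens an entity of class X, I-X continues it, O is outside any entity.
    A stray I-X with no matching open entity is treated as outside.
    """
    spans, current_class, current_tokens = [], None, []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current_tokens:
                spans.append((current_class, " ".join(current_tokens)))
            current_class, current_tokens = tag[2:], [token]
        elif tag.startswith("I-") and current_class == tag[2:]:
            current_tokens.append(token)
        else:
            if current_tokens:
                spans.append((current_class, " ".join(current_tokens)))
            current_class, current_tokens = None, []
    if current_tokens:
        spans.append((current_class, " ".join(current_tokens)))
    return spans


# Hypothetical example: transliterated tokens and made-up classifier outputs.
tokens = ["zara", "Mohamed", "Ali", "madinat", "Tunis"]
predictions = [
    ["O", "B-PER", "I-PER", "O", "B-LOC"],  # assumed classifier 1
    ["O", "B-PER", "O", "O", "B-LOC"],      # assumed classifier 2
    ["O", "B-PER", "I-PER", "O", "O"],      # assumed classifier 3
]
weights = [0.5, 0.3, 0.2]  # assumed weights, not taken from the paper

voted_tags = weighted_vote(predictions, weights)
entities = bio_to_spans(tokens, voted_tags)
```

In this sketch the vote settles segmentation (where an entity starts and ends) and categorization (which class it belongs to) in one pass, because both are encoded in the same per-token tag.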
Keywords
"Organizations","Tagging","Context","Training","Supervised learning","Pragmatics","Robustness"
Publisher
ieee
Conference_Titel
Arabic Computational Linguistics (ACLing), 2015 First International Conference on
Print_ISBN
978-1-4673-9154-2
Type
conf
DOI
10.1109/ACLing.2015.15
Filename
7422280