DocumentCode :
2347940
Title :
Patterns of syntactic trees for parsing arabic texts
Author :
Ben Fraj, Fériel ; Ben Othmane Zribi, C. ; Ben Ahmed, Mohamed
Author_Institution :
RIADI Lab., Manouba Univ., Manouba, Tunisia
fYear :
2010
fDate :
21-23 Aug. 2010
Firstpage :
1
Lastpage :
8
Abstract :
To parse Arabic texts, we use a machine learning approach that learns from an Arabic Treebank. The knowledge contained in this Treebank is structured as patterns of syntactic trees. These patterns are representative models of the syntactic components of the Arabic language. They are layered and both structurally and contextually rich, and they serve as an information source that guides the parsing process. Our parser is progressive: it processes a sentence in a number of stages equal to the number of its words. At each step, the parser assigns to the target word the patterns most likely to represent it in its context. It then joins the selected patterns with those collected in the previous steps to construct the syntactic tree(s) representing the whole sentence. Preliminary tests yielded an accuracy of 84.78% and an F-score of 77.52%.
Keywords :
computational linguistics; grammars; learning (artificial intelligence); natural language processing; trees (mathematics); Arabic Treebank; Arabic texts; machine learning approach; parsing process; syntactic trees; Context; Syntactics; Arabic language; Patterns of syntactic trees; combination of patterns; parsing
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Natural Language Processing and Knowledge Engineering (NLP-KE), 2010 International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4244-6896-6
Type :
conf
DOI :
10.1109/NLPKE.2010.5587791
Filename :
5587791