Title :
Patterns of syntactic trees for parsing Arabic texts
Author :
Ben Fraj, Fériel ; Ben Othmane Zribi, C. ; Ben Ahmed, Mohamed
Author_Institution :
RIADI Lab., Manouba Univ., Manouba, Tunisia
Abstract :
To parse Arabic texts, we adopt a machine learning approach that learns from an Arabic Treebank. The knowledge contained in this Treebank is structured as patterns of syntactic trees. These patterns are representative models of the syntactic components of the Arabic language. They are layered and rich in both structural and contextual information, and they serve as an information source that guides the parsing process. Our parser is progressive: it processes a sentence in a number of stages equal to the number of its words. At each stage, the parser assigns to the target word the patterns most likely to represent it in its context. It then joins the selected patterns with those collected in the previous stages to construct the syntactic tree(s) representing the whole sentence. Preliminary tests yielded an accuracy of 84.78% and an F-score of 77.52%.
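The progressive, one-stage-per-word procedure described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the `Pattern` class, the `select_patterns` and `parse_progressively` functions, the toy lexicon keyed by (previous word, word), the placeholder tokens `w1` to `w3`, the scores, and the beam limit. In particular, "joining" is reduced here to appending a pattern to a sequence and multiplying scores, whereas the paper combines actual tree patterns in a way the abstract does not detail.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Pattern:
    """Hypothetical stand-in for a syntactic-tree pattern learned from a treebank."""
    label: str    # phrase label, e.g. "NP"
    score: float  # assumed likelihood of this pattern for a word in its context


def select_patterns(word, previous_word, lexicon, top_k=2):
    """Return the top_k highest-scoring patterns for `word`.

    Context is modelled only by the previous word; if no context-specific
    entry exists, fall back to the context-free entry keyed by (None, word).
    """
    candidates = lexicon.get((previous_word, word)) or lexicon.get((None, word), [])
    return sorted(candidates, key=lambda p: p.score, reverse=True)[:top_k]


def parse_progressively(words, lexicon, top_k=2, beam=3):
    """Run one stage per word: select patterns, then join them with earlier ones."""
    analyses = [((), 1.0)]  # each analysis: (tuple of patterns, combined score)
    previous = None
    for word in words:
        selected = select_patterns(word, previous, lexicon, top_k)
        if not selected:
            raise ValueError(f"no pattern available for {word!r}")
        # Join every pattern selected at this stage with every partial analysis.
        joined = [
            (patterns + (pattern,), score * pattern.score)
            for (patterns, score), pattern in product(analyses, selected)
        ]
        # Keep only the best few partial analyses (an assumed pruning step).
        analyses = sorted(joined, key=lambda a: a[1], reverse=True)[:beam]
        previous = word
    return analyses


# Toy data with placeholder tokens; labels and scores are invented for the example.
toy_lexicon = {
    (None, "w1"): [Pattern("NP", 0.6), Pattern("VP", 0.4)],
    ("w1", "w2"): [Pattern("VP", 0.7), Pattern("NP", 0.3)],
    (None, "w2"): [Pattern("NP", 0.5)],
    (None, "w3"): [Pattern("PP", 0.9)],
}

results = parse_progressively(["w1", "w2", "w3"], toy_lexicon)
best_patterns, best_score = results[0]
print([p.label for p in best_patterns], best_score)
```

Because several candidate analyses survive each stage, the sketch can return more than one result for a sentence, which mirrors the "syntactic tree(s)" wording of the abstract.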
Keywords :
computational linguistics; grammars; learning (artificial intelligence); natural language processing; trees (mathematics); Arabic Treebank; Arabic texts; machine learning approach; parsing process; syntactic trees; Context; Syntactics; Arabic language; Patterns of syntactic trees; combination of patterns; parsing
Conference_Title :
Natural Language Processing and Knowledge Engineering (NLP-KE), 2010 International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4244-6896-6
DOI :
10.1109/NLPKE.2010.5587791