• DocumentCode
    2347940
  • Title
    Patterns of syntactic trees for parsing Arabic texts
  • Author
    Ben Fraj, Fériel ; Ben Othmane Zribi, C. ; Ben Ahmed, Mohamed
  • Author_Institution
    RIADI Lab., Manouba Univ., Manouba, Tunisia
  • fYear
    2010
  • fDate
    21-23 Aug. 2010
  • Firstpage
    1
  • Lastpage
    8
  • Abstract
    To parse Arabic texts, we use a machine learning approach that learns from an Arabic Treebank. The knowledge contained in this Treebank is structured as patterns of syntactic trees. These patterns are representative models of the syntactic components of the Arabic language. They are layered and both structurally and contextually rich, and they serve as an information source for guiding the parsing process. Our parser is progressive: it processes a sentence in a number of stages equal to the number of its words. At each step, the parser assigns to the target word the patterns most likely to represent it in its context. It then joins the selected patterns with those collected in the previous steps to construct the representative syntactic tree(s) of the whole sentence. Preliminary tests yielded an accuracy of 84.78% and an f-score of 77.52%.
  • Keywords
    computational linguistics; grammars; learning (artificial intelligence); natural language processing; trees (mathematics); Arabic Treebank; Arabic texts; machine learning approach; parsing process; syntactic trees; Context; Syntactics; Tin; Arabic language; Patterns of syntactic trees; combination of patterns; parsing;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Natural Language Processing and Knowledge Engineering (NLP-KE), 2010 International Conference on
  • Conference_Location
    Beijing
  • Print_ISBN
    978-1-4244-6896-6
  • Type
    conf
  • DOI
    10.1109/NLPKE.2010.5587791
  • Filename
    5587791