• DocumentCode
    3758116
  • Title
    N-scheme model: An approach towards reducing Arabic language sparseness
  • Author
    Mohamed Achraf Ben Mohamed;Sarra Zrigui;Anis Zouaghi;Mounir Zrigui
  • Author_Institution
    Faculty of Sciences of Monastir, Tunisia
  • fYear
    2015
  • Firstpage
    1
  • Lastpage
    5
  • Abstract
    In addition to the traditional characteristics of natural languages, such as implicitness, ambiguity, and imprecision, Arabic is known for its sparseness, which explains the difficulty of processing it automatically. On the other hand, Arabic has an interesting property: lemmas are generated by derivation from roots and schemes. Schemes are molds that change the form of a root through operations such as elongation, repetition, or the addition of characters. Schemes can also contribute meaning to the generated word. In this work, we studied the statistical characteristics of Arabic at the level of schemes and highlighted the attenuation of sparseness at this level. We then explored the possibility of building natural language processing tools for Arabic that rely on schemes. We found that schemes have great potential for building accurate natural language processing tools for Arabic. Relying entirely or partially on schemes, we built an n-scheme statistical model and a text classification system.
  • Keywords
    "Natural language processing","Silicon","Buildings","Computational modeling","Neural networks","Vocabulary","Training"
  • Publisher
    ieee
  • Conference_Titel
    Information & Communication Technology and Accessibility (ICTA), 2015 5th International Conference on
  • Type
    conf
  • DOI
    10.1109/ICTA.2015.7426895
  • Filename
    7426895