DocumentCode :
234363
Title :
A new method to construct a statistical model for Arabic language
Author :
Sadiqui, Ali ; Zinedine, Ahmed
Author_Institution :
Fac. of Sci. Dhar El Mahrez, Sidi Mohamed Ben Abdellah Univ., Atlas, Morocco
fYear :
2014
fDate :
20-22 Oct. 2014
Firstpage :
296
Lastpage :
299
Abstract :
Language models are a key component of modern automatic language processing systems. In this study we present a new approach to building a statistical language model of Arabic for non-vocalized texts. This approach makes it possible to overcome the morphological complexity of Arabic and to address the limitations of existing morphological analyzers. The classic approach adopted by most morphological analyzers takes each word out of its context and therefore generates several possible segmentations. Our solution uses trellises both to keep the segmentation possibilities generated by the morphological analyzer and to build the language model. To implement this solution, we used the following tools: AraMorph, Lattice-Tool from the SRILM toolkit, and AT & WSF. The language model was estimated from a corpus of 100 K words and tested on a corpus of 7 K words. The results and their analysis are presented in this paper.
Keywords :
computational linguistics; natural language processing; statistical analysis; text analysis; Arabic language processing; language model; morphological analyzer; nonvocalized text; statistical model; Analytical models; Complexity theory; Context; Decision support systems; Arabic Language Model; Automatic Arabic Language processing; Non-vocalized text; Statistical Model
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Information Science and Technology (CIST), 2014 Third IEEE International Colloquium in
Conference_Location :
Tetouan
Print_ISBN :
978-1-4799-5978-5
Type :
conf
DOI :
10.1109/CIST.2014.7016635
Filename :
7016635