DocumentCode :
3102102
Title :
A new stemming algorithm to extract quadri-literal Arabic roots
Author :
Kanaan, Ghassan ; Al-Shalabi, Riyad ; Jaam, Jihad M. ; Al-Kabi, Mohammed Naji ; Hasnah, Ahmad
Author_Institution :
Comput. Inf. Syst. Dept., Yarmouk Univ., Irbid, Jordan
fYear :
2004
fDate :
19-23 April 2004
Firstpage :
543
Abstract :
Summary form only given. We present a new stemming algorithm for extracting quadri-literal Arabic roots. The algorithm first removes the prefixes, then checks the word's characters from the last letter backward to the first. A temporary matrix stores the suffix letters of the Arabic word, and a second matrix stores the root letters. Before this partitioning step, the particle is removed from the source word. Each letter is tested for membership in the general standard Arabic word; if the test is positive, the letter is stored in the temporary matrix; otherwise, it is stored in the root matrix. In some cases, some of the word's original letters are mutated so that the substitute letters are stored in the root matrix. Finally, the letters in the root matrix are arranged according to their order in the original word. The algorithm was tested on a sample of 200 randomly generated words derived from quadri-literal Arabic verbs and achieved an accuracy rate of 95%.
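The partition step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation, and several details are assumptions. The abstract does not name the "standard word", so the sketch assumes it is the traditional mnemonic for the Arabic augment letters, سألتمونيها. The prefix list and the names `strip_prefix` and `partition_letters` are hypothetical. The letter-mutation step is omitted.

```python
# Illustrative sketch only; NOT the authors' implementation.
# ASSUMPTION: the "standard word" is the augment-letter mnemonic below.
AUGMENT_LETTERS = frozenset("سألتمونيها")

# ASSUMPTION: a tiny, hypothetical prefix/particle list for demonstration.
PREFIXES = ("ال",)


def strip_prefix(word: str) -> str:
    """Remove one listed prefix if at least four letters remain."""
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) - len(prefix) >= 4:
            return word[len(prefix):]
    return word


def partition_letters(word: str) -> tuple[str, str]:
    """Split a word into (root letters, affix letters).

    Letters are scanned from the last to the first. A letter found in
    AUGMENT_LETTERS goes to the temporary store, any other letter goes
    to the root store. Both stores are then put back in original order.
    """
    stem = strip_prefix(word)
    temp: list[tuple[int, str]] = []
    root: list[tuple[int, str]] = []
    for index in range(len(stem) - 1, -1, -1):
        letter = stem[index]
        target = temp if letter in AUGMENT_LETTERS else root
        target.append((index, letter))
    root.sort()  # restore the order of the original word
    temp.sort()
    return ("".join(ch for _, ch in root), "".join(ch for _, ch in temp))


if __name__ == "__main__":
    print(partition_letters("يدحرجون"))
```

Under these assumptions, the verb form يدحرجون yields the root letters دحرج and the affix letters يون. The sketch misclassifies any root letter that is itself an augment letter (for example, the ل in زلزل), which is presumably one of the cases the mutation and checking rules of the full algorithm are meant to handle.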
Keywords :
natural languages; text analysis; word processing; Arabic word; quadri-literal Arabic roots; root matrix; stemming algorithm; word characters extraction; Computer science; Data mining; Genetic mutations; Information systems; Matrices; Partitioning algorithms; Testing;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Information and Communication Technologies: From Theory to Applications, 2004. Proceedings. 2004 International Conference on
Print_ISBN :
0-7803-8482-2
Type :
conf
DOI :
10.1109/ICTTA.2004.1307872
Filename :
1307872