DocumentCode :
1928698
Title :
Stemming algorithm for different tenses to improve Persian dictionary
Author :
Ghazvini, A. ; Ab Aziz, Mohd Juzaidin
Author_Institution :
Fac. of Inf. Sci. & Technol., Univ. Kebangsaan Malaysia, Bangi, Malaysia
fYear :
2012
fDate :
23-26 Sept. 2012
Firstpage :
50
Lastpage :
53
Abstract :
Persian is an Indo-European language known for the complexity of its morphological structure. In this paper, we report on a Persian stemmer and its impact on improving a Persian dictionary. Persian has a variety of tenses; the focus here is on the past subjunctive, past perfect, continuous past, present perfect, and past simple. In Persian, affixes must be removed from verbs to obtain the stem. A finite state machine was therefore chosen to develop the Persian stemmer. According to the findings and test results, the dictionary based on the Persian stemming algorithm is fully accurate for regular verbs in the mentioned tenses.
Keywords :
dictionaries; finite state machines; natural language processing; Indo-European language; Persian dictionary; Persian language; Persian stemmer; Persian stemming algorithm based dictionary; continuous past; finite state machine; morphology structure; past perfect; past simple; past subjunctive; present perfect; tense; verbs; Algorithm; Dictionary; Persian; Stemming;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Industrial Electronics and Applications (ISIEA), 2012 IEEE Symposium on
Conference_Location :
Bandung
Print_ISBN :
978-1-4673-3004-6
Type :
conf
DOI :
10.1109/ISIEA.2012.6496669
Filename :
6496669