Title :
Stemming algorithm for different tenses to improve Persian dictionary
Author :
Ghazvini, A. ; Ab Aziz, Mohd Juzaidin
Author_Institution :
Fac. of Inf. Sci. & Technol., Univ. Kebangsaan Malaysia, Bangi, Malaysia
Abstract :
Persian is an Indo-European language known for its complex morphological structure. In this paper, we report on a Persian stemmer and its impact on improving a Persian dictionary. Persian has a variety of tenses; this work focuses on the past subjunctive, past perfect, continuous past, present perfect, and past simple. In Persian, affixes must be removed from verbs to obtain the stem, so a finite state machine was chosen to develop the stemmer. According to the test results, the dictionary based on the Persian stemming algorithm is fully accurate for regular verbs in the mentioned tenses.
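
To illustrate the general idea of finite-state affix stripping for the five tenses named above, here is a minimal Python sketch. It is not the authors' implementation: the state names, the function names (stem_verb, strip_ending, is_past_stem) and the token-splitting strategy are all assumptions made for illustration. It also assumes that "می" and the perfect endings are separated from the verb by a space or a zero-width non-joiner, that input uses Persian "ی" (U+06CC), and it relies on the fact that Persian past stems end in "د" or "ت".

```python
ZWNJ = "\u200c"  # zero-width non-joiner, often written after "می" and before "ام"

# Personal endings, longest first; "" is the unmarked third person singular.
ENDINGS = ["یم", "ید", "ند", "م", "ی", ""]

# Auxiliary forms that follow a past participle (illustrative lookup table).
AUXILIARIES = {
    "present perfect": {"ام", "ای", "است", "ایم", "اید", "اند"},
    "past perfect": {"بود" + e for e in ENDINGS},
    "past subjunctive": {"باشم", "باشی", "باشد", "باشیم", "باشید", "باشند"},
}


def is_past_stem(s):
    # Assumption: a plausible past stem has at least 2 letters and ends in د or ت.
    return len(s) >= 2 and s[-1] in "دت"


def strip_ending(word):
    # Try each personal ending, longest first, and keep the first valid stem.
    for ending in ENDINGS:
        if word.endswith(ending):
            stem = word[: len(word) - len(ending)]
            if is_past_stem(stem):
                return stem
    return None


def stem_verb(phrase):
    """Return (stem, tense) for a regular past-tense verb form, else None."""
    tokens = phrase.replace(ZWNJ, " ").split()
    state, stem, tense = "START", None, None
    for tok in tokens:
        if state == "START" and tok == "می":
            state = "AFTER_MI"  # imperfective prefix seen
        elif state == "START" and tok.endswith("ه") and is_past_stem(tok[:-1]):
            stem, state = tok[:-1], "PARTICIPLE"  # participle = stem + ه
        elif state in ("START", "AFTER_MI"):
            stem = strip_ending(tok)
            if stem is None:
                return None
            tense = "past simple" if state == "START" else "continuous past"
            state = "DONE"
        elif state == "PARTICIPLE":
            tense = next((t for t, forms in AUXILIARIES.items() if tok in forms), None)
            if tense is None:
                return None
            state = "DONE"
        else:
            return None  # extra token after an accepting state
    return (stem, tense) if state == "DONE" else None
```

For example, stem_verb("رفته بودیم") yields the stem "رفت" labelled as past perfect, while a non-verb such as "کتاب" is rejected. The sketch does not handle irregular verbs, negation prefixes, or "می" written joined to the verb without a separator.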
Keywords :
dictionaries; finite state machines; natural language processing; Indo-European language; Persian dictionary; Persian language; Persian stemmer; Persian stemming algorithm based dictionary; continuous past; finite state machine; morphology structure; past perfect; past simple; past subjunctive; present perfect; tense; verbs; Algorithm; Dictionary; Persian; Stemming;
Conference_Title :
Industrial Electronics and Applications (ISIEA), 2012 IEEE Symposium on
Conference_Location :
Bandung
Print_ISBN :
978-1-4673-3004-6
DOI :
10.1109/ISIEA.2012.6496669