DocumentCode :
131296
Title :
Improving Persian POS tagging using the maximum entropy model
Author :
Kardan, Ahmad ; Imani, Maryam Bahojb
Author_Institution :
Dept. of Comput. Eng. & Inf. Technol., Amirkabir Univ. of Technol., Tehran, Iran
fYear :
2014
fDate :
4-6 Feb. 2014
Firstpage :
1
Lastpage :
5
Abstract :
Part-of-speech (POS) tagging is a fundamental step in many speech and text processing applications. It is the process of assigning each word in an input sentence its category according to its contextual and grammatical properties. In addition to the general difficulties of POS tagging, such as disambiguating multi-category words and handling unknown words, Persian, unlike English, is a free word order language with its own characteristics. These challenges can greatly affect the quality of the tagging process. Efficient POS taggers have been developed for some languages, especially English, but only a few studies have addressed Persian. To address these issues and achieve high tagging accuracy, we selected features that capture the important characteristics of words in a sentence and used maximum entropy as the machine learning classifier. Experimental results show that the proposed Persian POS tagging system outperforms other state-of-the-art Persian taggers.
Keywords :
learning (artificial intelligence); maximum entropy methods; natural language processing; pattern classification; speech processing; text analysis; English language; Persian POS tagging improvement; Persian language; contextual properties; free-order language; grammatical properties; input sentences; machine learning classifier; maximum entropy model; multicategory word disambiguation; part-of-speech tagging; part-of-speech tagging process quality; speech processing; text processing; unknown word characteristics; word assignment; word categories; Accuracy; Entropy; Feature extraction; Hidden Markov models; Speech; Speech processing; Tagging; Maximum Entropy; Natural Language Processing; Part of Speech Tagging; Persian Part of Speech Tagging
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Intelligent Systems (ICIS), 2014 Iranian Conference on
Conference_Location :
Bam
Print_ISBN :
978-1-4799-3350-1
Type :
conf
DOI :
10.1109/IranianCIS.2014.6802567
Filename :
6802567