DocumentCode :
589685
Title :
A practical part-of-speech tagger for Bengali
Author :
Sarkar, Kamal ; Gayen, Vivekananda
Author_Institution :
Dept. of Comput. Sci. & Eng., Jadavpur Univ., Kolkata, India
fYear :
2012
fDate :
Nov. 30 2012-Dec. 1 2012
Firstpage :
36
Lastpage :
40
Abstract :
This paper presents a practical part-of-speech (POS) tagger for Bengali that accepts raw Bengali text (typed in a Bengali font) and produces POS-tagged output that can be used directly by other NLP applications. We have implemented a supervised Bengali trigram POS tagger from scratch using a statistical machine learning technique based on a second-order Hidden Markov Model (HMM). We use a bigram POS tagger as the baseline against which our trigram POS tagger is compared.
Keywords :
hidden Markov models; learning (artificial intelligence); natural language processing; statistical analysis; text analysis; Bengali font; Bengali text; HMM; Hidden Markov Model; NLP applications; practical part-of-speech tagger; statistical machine learning technique; supervised Bengali trigram POS Tagger; Equations; Hidden Markov models; Natural language processing; Speech; Tagging; Training; Viterbi algorithm; Bengali Language; Part-of-speech tagging; Second order hidden markov model;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Emerging Applications of Information Technology (EAIT), 2012 Third International Conference on
Conference_Location :
Kolkata
Print_ISBN :
978-1-4673-1828-0
Type :
conf
DOI :
10.1109/EAIT.2012.6407856
Filename :
6407856