DocumentCode
258754
Title
Design of a POS tagger using conditional random fields for Malayalam
Author
Krishnapriya, V. ; Sreesha, P. ; Harithalakshmi, T.R. ; Archana, T.C. ; Vettath, Jayasree N.
Author_Institution
Dept. of Comput. Sci. & Eng., Sreepathy Inst. of Manage. & Technol., Palakkad, India
fYear
2014
fDate
17-18 Dec. 2014
Firstpage
370
Lastpage
373
Abstract
Part-of-speech (POS) tagging is the process of marking each word in a text as corresponding to a particular part of speech, based on its definition and context. A POS tagger plays an important role in natural language applications such as speech recognition, natural language parsing, information retrieval, and information extraction. This paper discusses an architecture for designing a POS tagger for the Malayalam language using Conditional Random Fields (CRF). The experiments presented in this paper use an annotated corpus of 1,028 sentences (11,315 words) and a tagset of 100 tags. A trigram-based tagging scheme is used in the experiments. The proposed system is based on an empirical approach that models the human POS tagging process more realistically than existing systems, without compromising efficiency or accuracy.
Keywords
information retrieval; natural language processing; speech processing; CRF; Malayalam language; POS tagger design; annotated corpus; conditional random field; conditional random fields; information extraction; natural language applications; natural language parsing; speech recognition; speech tagging; text words; Accuracy; Hidden Markov models; Speech; Support vector machines; Tagging; Training; Hidden Markov Model; Malayalam; POS tagging; Stochastic process
fLanguage
English
Publisher
ieee
Conference_Titel
Computational Systems and Communications (ICCSC), 2014 First International Conference on
Conference_Location
Trivandrum
Print_ISBN
978-1-4799-6012-5
Type
conf
DOI
10.1109/COMPSC.2014.7032680
Filename
7032680