• DocumentCode
    258754
  • Title

    Design of a POS tagger using conditional random fields for Malayalam

  • Author

    Krishnapriya, V. ; Sreesha, P. ; Harithalakshmi, T.R. ; Archana, T.C. ; Vettath, Jayasree N.

  • Author_Institution
    Dept. of Comput. Sci. & Eng., Sreepathy Inst. of Manage. & Technol., Palakkad, India
  • fYear
    2014
  • fDate
    17-18 Dec. 2014
  • Firstpage
    370
  • Lastpage
    373
  • Abstract
    Part-of-speech (POS) tagging is the process of marking each word in a text as corresponding to a particular part of speech, based on its definition and context. A POS tagger plays an important role in natural language applications such as speech recognition, natural language parsing, and information retrieval and extraction. This paper discusses an architecture for designing a part-of-speech (POS) tagger for the Malayalam language using Conditional Random Fields (CRF). The experiments presented in this paper use an annotated corpus of 1028 sentences (11,315 words) and a tagset of 100 tags. A trigram-based tagging scheme is used in the experiments. The proposed system is based on an empirical approach that models human POS tagging more realistically than existing systems, without compromising efficiency or accuracy.
  • Keywords
    information retrieval; natural language processing; speech processing; CRF; Malayalam language; POS tagger design; annotated corpus; conditional random field; conditional random fields; information extraction; information retrieval; natural language applications; natural language parsing; speech recognition; speech tagging; text words; Accuracy; Hidden Markov models; Speech; Speech processing; Support vector machines; Tagging; Training; CRF; Hidden Markov Model; Malayalam; POS tagging; Stochastic process;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computational Systems and Communications (ICCSC), 2014 First International Conference on
  • Conference_Location
    Trivandrum
  • Print_ISBN
    978-1-4799-6012-5
  • Type

    conf

  • DOI
    10.1109/COMPSC.2014.7032680
  • Filename
    7032680