• DocumentCode
    2379143
  • Title

    Improving Vietnamese POS tagging by integrating a rich feature set and Support Vector Machines

  • Author

    Nghiem, Minh ; Dinh, Dien ; Nguyen, Mai

  • Author_Institution
    Fac. of Inf. Technol., Univ. of Sci., Ho Chi Minh City
  • fYear
    2008
  • fDate
    13-17 July 2008
  • Firstpage
    128
  • Lastpage
    133
  • Abstract
    Part of speech (POS) tagging is fundamental in natural language processing. So far, many methods have been applied for English and the task is well solved. However, there are few studies about this problem for Vietnamese. In this paper, we evaluate common features for English POS tagging and then propose some language specific features for Vietnamese POS tagging. Experimental results on the Vietnamese Lexicography Center's research group's corpus show that our POS tagger using this feature set trained by SVM outperforms other Vietnamese POS taggers.
  • Keywords
    natural language processing; speech processing; support vector machines; Vietnamese language; language specific features; part of speech tagging; rich feature set; Feature extraction; Hidden Markov models; Information technology; Machine learning; Natural languages; Support vector machine classification; Tagging
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Research, Innovation and Vision for the Future, 2008. RIVF 2008. IEEE International Conference on
  • Conference_Location
    Ho Chi Minh City
  • Print_ISBN
    978-1-4244-2379-8
  • Electronic_ISBN
    978-1-4244-2380-4
  • Type

    conf

  • DOI
    10.1109/RIVF.2008.4586344
  • Filename
    4586344