• DocumentCode
    3024289
  • Title

    A Semi-supervised Learning Method for Vietnamese Part-of-Speech Tagging

  • Author

    Le Minh Nguyen ; Xuan, Bach Ngo ; Viet, Cuong Nguyen ; Minh Pham Quang Nhat ; Shimazu, Akira

  • Author_Institution
    Sch. of Inf. Sci., JAIST, Ishikawa, Japan
  • fYear
    2010
  • fDate
    7-9 Oct. 2010
  • Firstpage
    141
  • Lastpage
    146
  • Abstract
    This paper presents a semi-supervised learning method for Vietnamese part-of-speech tagging. We consider two powerful tagging models, Conditional Random Fields (CRFs) and Guided Online-Learning models (GLs), as base learning models. We then propose a semi-supervised tagging model for both the CRF and GL methods. The main idea is to use a word-cluster model as an additional source to enrich the feature space of the discriminative learning models in both the training and decoding processes. Experimental results on Vietnamese Treebank data (VTB) show that the proposed method is effective. Our best model achieved an accuracy of 94.10% when tested on VTB, and 92.60% on an independent test.
  • Keywords
    learning (artificial intelligence); natural language processing; random processes; Vietnamese part-of-speech tagging; Vietnamese tree-bank data; conditional random fields; discriminate learning models; guided online-learning models; semi-supervised learning method; word-cluster model; Clustering algorithms; Context; Modeling; Speech; Tagging; Training; Training data; Conditional Random Fields; Guided Learning; Part of Speech tagging; Semi-Supervised Learning;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Knowledge and Systems Engineering (KSE), 2010 Second International Conference on
  • Conference_Location
    Hanoi
  • Print_ISBN
    978-1-4244-8334-1
  • Type

    conf

  • DOI
    10.1109/KSE.2010.35
  • Filename
    5632134