• DocumentCode
    2088636
  • Title
    Chinese Noun Phrases Chunking: A Latent Discriminative Model with Global Features
  • Author
    Sun, Xiao ; Nan, Xiaoli
  • Author_Institution
    Sch. of Comput. Sci. & Eng., Dalian Nat. Univ., Dalian, China
  • fYear
    2011
  • fDate
    24-26 Aug. 2011
  • Firstpage
    167
  • Lastpage
    172
  • Abstract
    In Chinese natural language processing, recognizing simple, non-recursive base phrases is an important task for applications such as information processing and machine translation. Instead of a rule-based model, we adopt a statistical machine learning method, the newly proposed Latent semi-CRF model, to solve the Chinese noun phrase chunking problem. Chinese base phrase recognition can be treated as a sequence labeling problem, which involves predicting a class label for each frame in an unsegmented sequence. Chinese noun phrases have sub-structures that cannot be observed in the training data. We propose a latent discriminative model called Latent semi-CRF (Latent Semi Conditional Random Fields), which combines the advantages of LDCRF (Latent Dynamic Conditional Random Fields) and semi-CRF, modeling the sub-structure of a class sequence and learning the dynamics between class labels, to detect Chinese noun phrases. Our results demonstrate that the latent dynamic discriminative model compares favorably to Support Vector Machines, the Maximum Entropy Model, and Conditional Random Fields (including LDCRF and semi-CRF) on Chinese noun phrase chunking.
  • Keywords
    learning (artificial intelligence); natural language processing; statistical analysis; Chinese natural language processing; Chinese noun phrase chunking; information processing task; latent discriminative model; latent dynamic conditional random field; latent semi-CRF model; machine translation task; maximum entropy model; semi-conditional random fields; sequence labeling problem; statistical machine learning method; support vector machines; Equations; Hidden Markov models; Inference algorithms; Magnetic heads; Mathematical model; Syntactics; Training; Chinese Noun Phrases Chunking; Global Features; Latent Discriminative Model; Natural Language Processing;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computational Science and Engineering (CSE), 2011 IEEE 14th International Conference on
  • Conference_Location
    Dalian, Liaoning
  • Print_ISBN
    978-1-4577-0974-6
  • Type
    conf
  • DOI
    10.1109/CSE.2011.40
  • Filename
    6062869