• DocumentCode
    2932499
  • Title
    Chinese Subcategorization Annotation Based on Machine Learning
  • Author
    Han, Xiwu
  • Author_Institution
    Sch. of Comput. Sci. & Technol., Heilongjiang Univ., Harbin, China
  • fYear
    2009
  • fDate
    24-26 Nov. 2009
  • Firstpage
    1503
  • Lastpage
    1508
  • Abstract
    There has been much research on large-scale automatic acquisition of subcategorization frames, and much progress has been made in lexicon building for quite a few languages, but subcategorization annotation of individual sentences remains a rarely explored field. This paper proposes to treat Chinese subcategorization annotation as a classification task using sequence kernel methods, which exploit the potential relations among the respective sentential constituents. Our final classification with word sequence kernel congregation and POS sequence kernel C-SVM achieved a promising accuracy of 92.36% on the test set, which is 13.51% higher than the baseline performance of the existing Chinese SCF hypothesis generator.
  • Keywords
    learning (artificial intelligence); natural language processing; pattern classification; support vector machines; Chinese subcategorization annotation; POS sequence kernel C-SVM; classification task; individual sentences; lexicon; machine learning; sequence kernel methods; subcategorization frame acquisition; word sequence kernel congregation; Computer science; Filtering; Information technology; Kernel; Machine learning; Maximum likelihood estimation; Natural language processing; Natural languages; Statistical analysis; Testing; annotation; machine learning; subcategorization
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Sciences and Convergence Information Technology, 2009. ICCIT '09. Fourth International Conference on
  • Conference_Location
    Seoul
  • Print_ISBN
    978-1-4244-5244-6
  • Electronic_ISBN
    978-0-7695-3896-9
  • Type
    conf
  • DOI
    10.1109/ICCIT.2009.36
  • Filename
    5370317