Exploiting syntactic and semantic information in coarse Chinese question classification

Author

Kang, Xin ; Wang, Xiaojie ; Ren, Fuji

Author_Institution

Beijing Univ. of Posts & Telecommun., Beijing

fYear

2008

fDate

19-22 Oct. 2008

Firstpage

1

Lastpage

7

Abstract

Recent years have seen great progress in the study of English question classification. In this research, we study Chinese question classification by exploiting the results of lexical, syntactic, and semantic parsing of question sentences. Support vector machines are used to train a classifier over 6 coarse categories, with single parsing results and combinations of them as features. We find that even surface information such as words and parts of speech leads to satisfactory results, while augmenting the classifier with syntactic and semantic features gives even higher precision. However, because most questions contain few words and have incomplete syntactic structures, combined features are even sparser in the feature space than single features, which adversely affects the performance of Chinese question classification.
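
The sparsity effect described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how features are combined, so the sketch assumes a combination means the conjunction of adjacent word/POS pairs. The toy questions, their POS tags, and all function names are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical toy questions, already segmented and hand-tagged as (word, POS).
# They are illustrative only and do not come from the paper's data set.
QUESTIONS = [
    [("谁", "r"), ("发明", "v"), ("了", "u"), ("电话", "n")],
    [("谁", "r"), ("写", "v"), ("了", "u"), ("红楼梦", "n")],
    [("中国", "ns"), ("的", "u"), ("首都", "n"), ("在", "p"), ("哪里", "r")],
]


def word_features(question):
    """Single feature type: one feature per word."""
    return ["w=" + word for word, _ in question]


def pos_features(question):
    """Single feature type: one feature per part-of-speech tag."""
    return ["p=" + pos for _, pos in question]


def combined_features(question):
    """Assumed combination: conjunction of adjacent word/POS pairs."""
    tagged = [word + "/" + pos for word, pos in question]
    return [left + "+" + right for left, right in zip(tagged, tagged[1:])]


def mean_occurrences(extract, questions):
    """Average count per distinct feature; a lower value means a sparser space."""
    counts = Counter(f for q in questions for f in extract(q))
    return sum(counts.values()) / len(counts)


if __name__ == "__main__":
    for name, extract in [
        ("words", word_features),
        ("pos", pos_features),
        ("combined", combined_features),
    ]:
        print(name, round(mean_occurrences(extract, QUESTIONS), 2))
```

On this toy input each combined feature occurs only once, while word and POS features repeat across questions. Under the stated assumption, this is the kind of sparsity that leaves a classifier with few shared features between short questions.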

Keywords

classification; grammars; natural language processing; support vector machines; text analysis; Chinese question classification; lexical parsing; parts of speech; question sentence; semantic information; semantic parsing; support vector machine; syntactic information; syntactic parsing; Classification tree analysis; Data mining; Filters; Information retrieval; Learning systems; Natural languages; Search engines; Speech; Support vector machine classification; Support vector machines

fLanguage

English

Publisher

IEEE

Conference_Titel

Natural Language Processing and Knowledge Engineering, 2008. NLP-KE '08. International Conference on

Conference_Location

Beijing

Print_ISBN

978-1-4244-4515-8

Electronic_ISBN

978-1-4244-2780-2

Type

conf

DOI

10.1109/NLPKE.2008.4906803

Filename

4906803