DocumentCode :
2753607
Title :
Improving the Accuracy of Question Classification with Machine Learning
Author :
Nguyen, Tri Thanh ; Nguyen, Le Minh ; Shimazu, Akira
Author_Institution :
Sch. of Inf. Sci., Japan Adv. Inst. of Sci. & Technol., Nomi
fYear :
2007
fDate :
5-9 March 2007
Firstpage :
234
Lastpage :
241
Abstract :
Question classification is an important phase in question answering systems. In this paper, we propose to apply i) hierarchical classifiers, ii) hierarchical classifiers in combination with semi-supervised learning, and iii) hierarchy expansion to question classification in order to improve precision. When the number of classes is large, the performance of classification algorithms may degrade. To improve performance by reducing the number of classes each classifier must handle, we propose to use hierarchical classifiers that follow the question taxonomy, in which a classifier is attached to each internal node. We use semi-supervised learning to exploit unlabeled questions, with the expectation of improving the performance of the classifiers in the hierarchy. We explored different ways of applying learning methods to each classifier in the hierarchy: a) supervised learning for all classifiers at all levels; b) semi-supervised learning for the first-level classifier and supervised learning for the other classifiers; c) semi-supervised learning for all classifiers. The experiments show that the first method (a) gives better results than flat classification and that the second method (b) gives better results than the first, while the attempt to increase the performance of the fine classifiers in the last method (c) is less successful. As a further effort, we propose to automatically group question classes by clustering in order to expand a node of the question taxonomy that has a large number of classes. The experiment shows that this also improves the overall precision.
Keywords :
classification; information retrieval; learning (artificial intelligence); hierarchical classifiers; machine learning; question answering systems; question classification; question taxonomy; semi-supervised learning; Classification algorithms; Costs; Information retrieval; Information science; Learning systems; Machine learning; Search engines; Semisupervised learning; Supervised learning; Taxonomy;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Research, Innovation and Vision for the Future, 2007 IEEE International Conference on
Conference_Location :
Hanoi
Print_ISBN :
1-4244-0694-3
Type :
conf
DOI :
10.1109/RIVF.2007.369162
Filename :
4223079