DocumentCode
464185
Title
Towards Answering How do I Questions Using Classification
Author
Wu, Kelvin ; Yu, Lei ; Cutler, Michal
Author_Institution
Comput. Sci., Binghamton Univ., Vestal, NY
Volume
1
fYear
2007
fDate
21-23 May 2007
Firstpage
266
Lastpage
271
Abstract
Interest in developing open-domain question answering systems that leverage the massive amount of knowledge available on the Web is on the rise. In this investigation, we address the problem of answering "How do I" questions. Our goal is to use the top results obtained from a search engine to extract and present correct answers. Identifying correct answers to such questions is a hard problem that seems to require deep natural language understanding. Fortunately, answers to "How do I" questions are often procedural, typically containing a sequence of actions. Learning to label text as procedural or non-procedural is an easier problem, which we attempted to solve by extracting 12 informative features with which we trained classifiers. However, the corpus built from the top documents retrieved for a set of "How do I"-equivalent queries turned out to be highly imbalanced. To tackle this issue, sampling techniques were used for a variety of classification methods, yielding reasonable recall and precision for the minority class of procedural texts.
Keywords
classification; information retrieval; learning (artificial intelligence); search engines; text analysis; Web document retrieval; correct answer extraction; machine learning technique; open domain question answering system; procedural text classification method; sampling technique; search engine; text labeling; trained classifier; Automotive components; Blogs; Computer science; Feature extraction; Internet; Kelvin; Natural languages; Sampling methods; Search engines; Web search
fLanguage
English
Publisher
IEEE
Conference_Titel
Advanced Information Networking and Applications Workshops, 2007, AINAW '07. 21st International Conference on
Conference_Location
Niagara Falls, Ont.
Print_ISBN
978-0-7695-2847-2
Type
conf
DOI
10.1109/AINAW.2007.356
Filename
4221071