Title :
Formalized answer extraction technology based on pattern learning
Author :
Peng, Li ; Wen-Da, Teng ; Wei, Zheng ; Kai-Hui, Zhang
Author_Institution :
Coll. of Comput. Sci. & Technol., Harbin Univ. of Sci. & Technol. (HUST), Harbin, China
Abstract :
Open-domain question answering (QA) is an interesting and challenging research topic in natural language processing. A QA system differs from traditional text retrieval in its answer extraction module, which extracts the exact answer rather than returning whole documents. Answer extraction based on pattern matching is an efficient strategy that represents answers through formalized patterns. The main goal of formalized, pattern-matching-based answer extraction is to build a complete pattern knowledge base, and automatic construction of formalized answer extraction patterns is the direction in which the field is moving. Unfortunately, formalized answer extraction is still less effective than extraction methods based on statistical learning. This paper analyzes four problems: (1) low question coverage, (2) unreliable pattern tags, (3) difficulty in assessing pattern confidence, and (4) a low level of pattern generalization. To address these four problems, the paper constructs the pattern knowledge base automatically through pattern learning and a question classification scheme based on answer types, and uses reliable pattern tags to formalize the patterns, which substantially increases coverage and accuracy. It then assesses pattern confidence in terms of coverage and accuracy. Finally, it proposes a pattern generalization technique based on the principle of leaving the major elements of a pattern unchanged, which noticeably improves generalization performance. Experimental results show an average coverage of 57.2% and an average accuracy of 46.2%; accuracy on major questions is 80.8%, and the generalization technique increases accuracy by nearly 6%. In sum, the paper achieves high extraction accuracy with simple methods, in particular in cases where a pattern matches.
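The abstract does not give the pattern format or the confidence formula, so the following is only a minimal sketch of the general idea of scoring an answer extraction pattern by coverage and accuracy. Everything in it is an assumption made for illustration: the `SurfacePattern` class, the `<Q>` slot notation, the `BIRTHYEAR` answer type, the toy examples, and the use of a harmonic mean to combine coverage and accuracy are hypothetical and are not taken from the paper.

```python
import re
from dataclasses import dataclass


@dataclass
class SurfacePattern:
    """Hypothetical surface pattern: a regex with a <Q> slot for the
    question term and exactly one capture group for the answer."""
    answer_type: str
    template: str

    def extract(self, q_term, sentence):
        # Fill the <Q> slot with the (escaped) question term, then match.
        regex = self.template.replace("<Q>", re.escape(q_term))
        match = re.search(regex, sentence)
        return match.group(1) if match else None


def coverage_and_accuracy(pattern, examples):
    """examples: list of (question term, sentence, gold answer).
    coverage = matched examples / all examples
    accuracy = correct extractions / matched examples"""
    matched = 0
    correct = 0
    for q_term, sentence, gold in examples:
        answer = pattern.extract(q_term, sentence)
        if answer is not None:
            matched += 1
            if answer == gold:
                correct += 1
    coverage = matched / len(examples) if examples else 0.0
    accuracy = correct / matched if matched else 0.0
    return coverage, accuracy


def confidence(coverage, accuracy):
    # Assumption: harmonic mean of the two scores. The paper only says
    # confidence is assessed "in terms of coverage and accuracy".
    total = coverage + accuracy
    return 2 * coverage * accuracy / total if total else 0.0


# Toy data with invented names; not from the paper's experiments.
pattern = SurfacePattern("BIRTHYEAR", r"<Q> \(born (\d{4})\)")
examples = [
    ("Alice", "Alice (born 1970) is a chemist.", "1970"),      # match, correct
    ("Bob", "Bob (born 1985) is a pilot.", "1985"),            # match, correct
    ("Carol", "Carol (born 1990) is often misdated.", "1991"), # match, wrong
    ("Dave", "Dave was born in 1960.", "1960"),                # no match
]
cov, acc = coverage_and_accuracy(pattern, examples)
conf = confidence(cov, acc)
```

On this toy set the pattern matches three of the four sentences and two of those three extractions agree with the gold answer, so a pattern that is precise but narrow and one that is broad but noisy can be told apart by the two scores before they are combined.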
Keywords :
learning (artificial intelligence); natural language processing; question answering (information retrieval); statistical analysis; text analysis; formalized answer extraction technology; natural language processing; open-domain question answering system; pattern knowledge database; pattern learning; pattern matching; statistical learning; text retrieval; Accuracy; Bismuth; Feature extraction; Manuals; Organizations; Training; Answer Extraction; Pattern Learning; Question Answering;
Conference_Title :
Strategic Technology (IFOST), 2010 International Forum on
Conference_Location :
Ulsan
Print_ISBN :
978-1-4244-9038-7
Electronic_ISBN :
978-1-4244-9036-3
DOI :
10.1109/IFOST.2010.5667981