DocumentCode :
2225725
Title :
Mining answers in German Web pages
Author :
Neumann, Günter ; Xu, Feiyu
Author_Institution :
Language Technol. Lab., DFKI, Saarbrücken, Germany
fYear :
2003
fDate :
13-17 Oct. 2003
Firstpage :
125
Lastpage :
131
Abstract :
We present a novel method for mining textual answers in German Web pages using semistructured NL questions and Google for initial document retrieval. We exploit the redundancy on the Web by weighting all identified named entities (NEs) found in the relevant document set based on their occurrences and distributions. The ranked NEs are used as our primary anchors for document indexing, paragraph selection, and answer identification. Answer identification depends on two factors: the overlap of terms at different levels (e.g., tokens and named entities) between queries and sentences, and the relevance of identified NEs corresponding to the expected answer type. The set of answer candidates is further subdivided into ranked equivalence classes from which the final answer is selected. The system has been evaluated using question-answer pairs extracted from a popular German quiz book.
Keywords :
Internet; data mining; equivalence classes; indexing; natural languages; query processing; relevance feedback; set theory; German Web pages; German quiz book; Google page; document indexing; document retrieval; equivalent classes; identified named entities relevance; paragraph selection; question-answer pairs; semistructured NL questions; textual answers mining; Australia; Books; Cities and towns; Data mining; Indexing; Information retrieval; Natural languages; Pattern matching; Performance evaluation; Web pages;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Web Intelligence, 2003. WI 2003. Proceedings. IEEE/WIC International Conference on
Print_ISBN :
0-7695-1932-6
Type :
conf
DOI :
10.1109/WI.2003.1241183
Filename :
1241183