DocumentCode
570176
Title
Evaluating and enhancing cross-domain rank predictability of textual entailment datasets
Author
Lee, Cheng-Wei ; Lin, Chuan-Jie ; Shima, Hideki ; Hsu, Wen-Lian
Author_Institution
Inst. of Inf. Sci., Acad. Sinica, Taipei, Taiwan
fYear
2012
fDate
8-10 Aug. 2012
Firstpage
51
Lastpage
58
Abstract
Textual Entailment (TE) is the task of recognizing entailment, paraphrase, and contradiction relations between a given text pair. The goal of textual entailment research is to develop a core inference component that can be applied to various domains, such as IR or NLP. Since the domain to which a TE system is applied may differ from its source domain, it is crucial to develop proper datasets for measuring the cross-domain ability of a TE system. We propose using Kendall's tau to measure a dataset's cross-domain rank predictability. Our analysis shows that incorporating “artificial pairs” into a dataset helps enhance its rank predictability. We also find that the completeness of guidelines has no obvious effect on the rank predictability of a dataset. More investigation is needed to validate these findings; however, they suggest some new directions for the creation of TE datasets in the future.
Keywords
text analysis; Kendall's tau; TE; core inference component; enhancing cross domain rank predictability; textual entailment datasets; Accuracy; Correlation; Educational institutions; Guidelines; Humans; Standards; Text recognition; Cross-Domain Evaluation; RITE; Rank Predictability; Textual Entailment
fLanguage
English
Publisher
IEEE
Conference_Titel
Information Reuse and Integration (IRI), 2012 IEEE 13th International Conference on
Conference_Location
Las Vegas, NV
Print_ISBN
978-1-4673-2282-9
Electronic_ISBN
978-1-4673-2283-6
Type
conf
DOI
10.1109/IRI.2012.6302990
Filename
6302990