• DocumentCode
    570176
  • Title
    Evaluating and enhancing cross-domain rank predictability of textual entailment datasets
  • Author
    Lee, Cheng-Wei ; Lin, Chuan-Jie ; Shima, Hideki ; Hsu, Wen-Lian
  • Author_Institution
    Inst. of Inf. Sci., Acad. Sinica, Taipei, Taiwan
  • fYear
    2012
  • fDate
    8-10 Aug. 2012
  • Firstpage
    51
  • Lastpage
    58
  • Abstract
    Textual Entailment (TE) is the task of recognizing entailment, paraphrase, and contradiction relations between a given text pair. The goal of textual entailment research is to develop a core inference component that can be applied to various domains, such as IR or NLP. Since the domain that a TE system is applied to may differ from its source domain, it is crucial to develop proper datasets for measuring the cross-domain ability of a TE system. We propose using Kendall's tau to measure a dataset's cross-domain rank predictability. Our analysis shows that incorporating “artificial pairs” into a dataset helps enhance its rank predictability. We also find that the completeness of guidelines has no obvious effect on the rank predictability of a dataset. To validate these findings, more investigation is needed; however, these findings suggest some new directions for the creation of TE datasets in the future.
  • Keywords
    text analysis; Kendall's tau; TE; core inference component; enhancing cross domain rank predictability; textual entailment datasets; Accuracy; Correlation; Educational institutions; Guidelines; Humans; Standards; Text recognition; Cross-Domain Evaluation; RITE; Rank Predictability; Textual Entailment
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information Reuse and Integration (IRI), 2012 IEEE 13th International Conference on
  • Conference_Location
    Las Vegas, NV
  • Print_ISBN
    978-1-4673-2282-9
  • Electronic_ISBN
    978-1-4673-2283-6
  • Type
    conf
  • DOI
    10.1109/IRI.2012.6302990
  • Filename
    6302990
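
The abstract proposes Kendall's tau as a measure of a dataset's cross-domain rank predictability. As a rough illustration only, the sketch below computes Kendall's tau-a (no tie correction) between the scores that a set of systems obtains on two datasets. The function name `kendall_tau`, the choice of the tau-a variant, and the accuracy figures are all assumptions made for this example; none of them come from the paper, which may use a different variant or procedure.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs.

    Hypothetical helper for illustration; tied pairs count as neither
    concordant nor discordant, and no tie correction is applied.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    concordant = discordant = 0
    for i, j in combinations(range(len(xs)), 2):
        # Positive product: both datasets order systems i and j the same way.
        product = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs


# Made-up accuracies of four TE systems on a source-domain dataset and a
# target-domain dataset (illustrative numbers, not results from the paper).
source_scores = [0.62, 0.58, 0.71, 0.55]
target_scores = [0.60, 0.61, 0.69, 0.50]

tau = kendall_tau(source_scores, target_scores)
print(f"Kendall's tau between the two system rankings: {tau:.3f}")
```

A value near 1 would mean that the ranking of systems on one dataset closely predicts their ranking on the other, while a value near 0 would mean that it predicts little.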