• DocumentCode
    3673682
  • Title
    Machine Learning for Imbalanced Datasets of Recognizing Inference in Text with Linguistic Phenomena
  • Author
    Min-Yuh Day;Cheng-Chia Tsai
  • Author_Institution
    Dept. of Inf. Manage., Tamkang Univ., Taipei, Taiwan
  • fYear
    2015
  • Firstpage
    562
  • Lastpage
    568
  • Abstract
    Recognizing inference in text (RITE) plays an important role in the answer validation modules of Question Answering (QA) systems. The problem of class imbalance has received increasing attention in the machine learning community. In recent years, several studies have analyzed linguistic phenomena; however, little is known about the effects of imbalanced datasets with linguistic phenomena on recognizing inference in text. The objective of this paper is to provide an empirical study of learning from imbalanced RITE datasets with linguistic phenomena, in order to better understand these effects. We analyze imbalanced RITE datasets with linguistic phenomena using the NTCIR-11 RITE-VAL gold standard dataset and development dataset. The experimental results suggest that the performance of a machine learning classifier can vary dramatically with the distribution of imbalanced RITE datasets with linguistic phenomena.
  • Keywords
    "Pragmatics","Standards","Gold","Text recognition","Yttrium","Accuracy","Semantics"
  • Publisher
    IEEE
  • Conference_Titel
    Information Reuse and Integration (IRI), 2015 IEEE International Conference on
  • Type
    conf
  • DOI
    10.1109/IRI.2015.99
  • Filename
    7301027