• DocumentCode
    3724098
  • Title
    A Bayesian Hierarchical Model for Comparing Average F1 Scores
  • Author
    Dell Zhang;Jun Wang;Xiaoxue Zhao;Xiaoling Wang
  • Author_Institution
    ISSIS, Birkbeck, Univ. of London, London, UK
  • fYear
    2015
  • Firstpage
    589
  • Lastpage
    598
  • Abstract
    In multi-class text classification, the performance (effectiveness) of a classifier is usually measured by micro-averaged and macro-averaged F1 scores. However, the scores themselves do not tell us how reliable they are in terms of forecasting the classifier's future performance on unseen data. In this paper, we propose a novel approach to explicitly modelling the uncertainty of average F1 scores through Bayesian reasoning, and demonstrate that it can provide much more comprehensive performance comparison between text classifiers than the traditional frequentist null hypothesis significance testing (NHST).
  • Keywords
    "Bayes methods","Estimation","Computational modeling","Data models","Uncertainty","Electronic mail","Testing"
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining (ICDM), 2015 IEEE International Conference on
  • ISSN
    1550-4786
  • Type
    conf
  • DOI
    10.1109/ICDM.2015.44
  • Filename
    7373363