• DocumentCode
    3163886
  • Title
    Controlling Attribute Effect in Linear Regression
  • Author
    Calders, Toon ; Karim, Asad ; Kamiran, Faisal ; Ali, Wesam ; Zhang, Xiangliang
  • Author_Institution
    Comput. & Decision Eng. Dept., Univ. Libre de Bruxelles (ULB), Brussels, Belgium
  • fYear
    2013
  • fDate
    7-10 Dec. 2013
  • Firstpage
    71
  • Lastpage
    80
  • Abstract
    In data mining we often have to learn from biased data, for instance because the data come from different batches or because there was a gender or racial bias in the collection of social data. In some applications it may be necessary to explicitly control this bias in the models we learn from the data. This paper is the first to study learning linear regression models under constraints that control the biasing effect of a given attribute such as gender or batch number. We show how propensity modeling can be used to factor out the part of the bias that can be justified by externally provided explanatory attributes. We then analytically derive linear models that minimize squared error while controlling the bias by imposing constraints on the mean outcome or the residuals of the models. Experiments on discrimination-aware crime prediction and batch effect normalization tasks show that the proposed techniques succeed in controlling attribute effects in linear regression models.
  • Keywords
    data mining; learning (artificial intelligence); regression analysis; attribute biasing effect; attribute effect control; batch effect normalization tasks; batch number; data mining; discrimination-aware crime prediction; gender; learning linear regression models; propensity modeling; racial bias; social data collection; squared error minimization; Analytical models; Biological system modeling; Data mining; Data models; Linear regression; Predictive models; Vectors; Batch Effects; Fair Data Mining; Linear Regression; Propensity Score;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining (ICDM), 2013 IEEE 13th International Conference on
  • Conference_Location
    Dallas, TX
  • ISSN
    1550-4786
  • Type
    conf
  • DOI
    10.1109/ICDM.2013.114
  • Filename
    6729491
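
The abstract mentions linear models that minimize squared error under a constraint on the mean outcome. The sketch below illustrates that general idea with a standard equality-constrained least-squares solution; it is not taken from the paper and may differ from the authors' derivation. The function name `equal_mean_least_squares`, the binary group indicator `s`, and the synthetic data are all assumptions made for illustration. The constraint used here is that the mean prediction is the same in both groups, and the propensity-based stratification described in the abstract is not modeled.

```python
import numpy as np


def equal_mean_least_squares(X, y, s):
    """Hypothetical helper: least squares with equal mean prediction per group.

    Minimizes ||A w - y||^2 subject to d @ w == 0, where A is X with an
    intercept column and d is the difference between the group mean rows of A.
    The sensitive attribute s is used only in the constraint, not as a feature.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    s = np.asarray(s).astype(bool)

    A = np.column_stack([np.ones(len(X)), X])    # design matrix with intercept
    d = A[s].mean(axis=0) - A[~s].mean(axis=0)   # mean-prediction gap is d @ w

    G = A.T @ A
    w_ols = np.linalg.solve(G, A.T @ y)          # unconstrained solution
    G_inv_d = np.linalg.solve(G, d)
    # Closed-form correction from the Lagrangian of the constrained problem.
    w = w_ols - G_inv_d * (d @ w_ols) / (d @ G_inv_d)
    return w, w_ols


# Synthetic data (assumed): features and target both depend on the group s.
rng = np.random.default_rng(0)
n = 500
s = rng.integers(0, 2, size=n)
x = rng.normal(size=(n, 2)) + 0.8 * s[:, None]
y = x @ np.array([1.0, -0.5]) + 2.0 * s + rng.normal(scale=0.1, size=n)

w, w_ols = equal_mean_least_squares(x, y, s)
A = np.column_stack([np.ones(n), x])
gap = (A @ w)[s == 1].mean() - (A @ w)[s == 0].mean()
gap_ols = (A @ w_ols)[s == 1].mean() - (A @ w_ols)[s == 0].mean()
print("mean prediction gap, unconstrained:", gap_ols)
print("mean prediction gap, constrained:  ", gap)
```

In this sketch the constrained gap is zero up to floating-point error, while the unconstrained fit retains a gap because the features are correlated with the group indicator. The price of the constraint is a higher squared error than ordinary least squares.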