• DocumentCode
    3437124
  • Title

    On Estimation of Functional Causal Models: Post-Nonlinear Causal Model as an Example

  • Author

    Kun Zhang ; Zhikun Wang ; Bernhard Scholkopf

  • Author_Institution
    Dept. Empirical Inference, Max-Planck Inst. for Intell. Syst., Tubingen, Germany
  • fYear
    2013
  • fDate
    7-10 Dec. 2013
  • Firstpage
    139
  • Lastpage
    146
  • Abstract
    Compared to constraint-based causal discovery, causal discovery based on functional causal models is able to identify the whole causal model under appropriate assumptions. Functional causal models represent the effect as a function of the direct causes together with an independent noise term. Examples include the linear non-Gaussian acyclic model (LiNGAM), the nonlinear additive noise model, and the post-nonlinear (PNL) model. Currently there are two ways to estimate the parameters in these models: dependence minimization and maximum likelihood. In this paper, we show that for any acyclic functional causal model, minimizing the mutual information between the hypothetical cause and the noise term is equivalent to maximizing the data likelihood with a flexible model for the distribution of the noise term. We then focus on estimation of the PNL causal model, and propose to estimate it with the warped Gaussian process, with the noise modeled by a mixture of Gaussians. As a Bayesian nonparametric approach, it outperforms the previous one based on mutual information minimization with nonlinear functions represented by multilayer perceptrons. We also show that, unlike ordinary regression, estimation results of the PNL causal model are sensitive to the assumption on the noise distribution. Experimental results on both synthetic and real data support our theoretical claims.
  • Keywords
    Bayes methods; Gaussian processes; causality; data mining; maximum likelihood estimation; nonparametric statistics; Bayesian nonparametric approach; LiNGAM; PNL causal model; PNL model; constraint-based causal discovery; acyclic functional causal model; data likelihood; dependence minimization; independent noise term; linear non-Gaussian acyclic model; maximum likelihood; multilayer perceptron; mutual information minimization; noise distribution; nonlinear additive noise model; nonlinear functions; ordinary regression; parameter estimation; post-nonlinear causal model; post-nonlinear model; real data support; synthetic data support; warped Gaussian process; Additive noise; Data models; Gaussian processes; Minimization; Mutual information; Nonlinear distortion; Causal discovery; Functional causal model; Post-nonlinear causal model; Warped Gaussian processes; maximum likelihood; mutual information minimization;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining Workshops (ICDMW), 2013 IEEE 13th International Conference on
  • Conference_Location
    Dallas, TX
  • Print_ISBN
    978-1-4799-3143-9
  • Type

    conf

  • DOI
    10.1109/ICDMW.2013.162
  • Filename
    6753913