Causal KL: Evaluating Causal Discovery
- URL: http://arxiv.org/abs/2111.06029v1
- Date: Thu, 11 Nov 2021 02:46:53 GMT
- Title: Causal KL: Evaluating Causal Discovery
- Authors: Rodney T. O'Donnell, Kevin B. Korb and Lloyd Allison
- Abstract summary: The two most commonly used criteria for assessing causal model discovery with artificial data are edit-distance and Kullback-Leibler divergence.
We argue that they are both insufficiently discriminating in judging the relative merits of false models.
We propose an augmented KL divergence, which takes into account causal relationships which distinguish between observationally equivalent models.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The two most commonly used criteria for assessing causal model discovery with
artificial data are edit-distance and Kullback-Leibler divergence, measured
from the true model to the learned model. Both of these metrics maximally
reward the true model. However, we argue that they are both insufficiently
discriminating in judging the relative merits of false models. Edit distance,
for example, fails to distinguish between strong and weak probabilistic
dependencies. KL divergence, on the other hand, rewards equally all
statistically equivalent models, regardless of their different causal claims.
We propose an augmented KL divergence, which we call Causal KL (CKL), which
takes into account causal relationships which distinguish between
observationally equivalent models. Results are presented for three variants of
CKL, showing that Causal KL works well in practice.
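To make the contrast concrete, the following is a minimal sketch in plain NumPy (the two-variable network, its parameters, and the helper function are illustrative assumptions, not taken from the paper). It shows that KL divergence measured from the true joint distribution cannot separate the Markov-equivalent structures X -> Y and Y -> X, while edit distance charges the same single edge reversal whether the dependency of Y on X is strong or nearly vanishing.

```python
import numpy as np

def kl(p, q):
    """KL divergence KL(p || q) between two joint distributions of the same shape."""
    p, q = np.asarray(p, float).ravel(), np.asarray(q, float).ravel()
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

# "True" causal model X -> Y (hypothetical parameters, for illustration only).
p_x = np.array([0.3, 0.7])                     # P(X)
p_y_given_x = np.array([[0.9, 0.1],            # P(Y | X=0)
                        [0.2, 0.8]])           # P(Y | X=1)
joint_xy = p_x[:, None] * p_y_given_x          # P(X, Y), indexed [x, y]

# Markov-equivalent model Y -> X, parametrised from the same joint.
p_y = joint_xy.sum(axis=0)                     # P(Y)
p_x_given_y = joint_xy / p_y                   # P(X | Y), indexed [x, y]
joint_yx = p_y * p_x_given_y                   # reconstructs the identical P(X, Y)

print(kl(joint_xy, joint_yx))   # ~0.0: KL cannot tell the two DAGs apart
# Edit distance between X -> Y and Y -> X is 1 (one reversal) regardless of
# whether p_y_given_x encodes a strong or an almost vanishing dependency.
```

CKL is aimed at exactly this gap: the reversed model makes a false causal claim that purely observational scores such as KL cannot penalise.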
Related papers
- Quantifying the Sensitivity of Inverse Reinforcement Learning to Misspecification [72.08225446179783]
Inverse reinforcement learning aims to infer an agent's preferences from their behaviour.
To do this, we need a behavioural model of how the policy $\pi$ relates to the reward function $R$.
We analyse how sensitive the IRL problem is to misspecification of the behavioural model.
arXiv Detail & Related papers (2024-03-11T16:09:39Z)
- Identifiable Latent Polynomial Causal Models Through the Lens of Change [82.14087963690561]
Causal representation learning aims to unveil latent high-level causal representations from observed low-level data.
One of its primary tasks is to provide reliable assurance of identifying these latent causal models, known as identifiability.
arXiv Detail & Related papers (2023-10-24T07:46:10Z)
- Achieving Counterfactual Fairness with Imperfect Structural Causal Model [11.108866104714627]
We propose a novel minimax game-theoretic model for counterfactual fairness.
We also theoretically prove the error bound of the proposed minimax model.
Empirical experiments on multiple real-world datasets illustrate our superior performance in both accuracy and fairness.
arXiv Detail & Related papers (2023-03-26T09:37:29Z)
- Estimation of Bivariate Structural Causal Models by Variational Gaussian Process Regression Under Likelihoods Parametrised by Normalising Flows [74.85071867225533]
Causal mechanisms can be described by structural causal models.
One major drawback of state-of-the-art artificial intelligence is its lack of explainability.
arXiv Detail & Related papers (2021-09-06T14:52:58Z)
- Comparing Kullback-Leibler Divergence and Mean Squared Error Loss in Knowledge Distillation [9.157410884444312]
Knowledge distillation (KD) has been investigated to design efficient neural architectures.
We show that the KL divergence loss focuses on logit matching as $\tau$ increases and on label matching as $\tau$ goes to 0 (see the sketch after this list).
We show that sequential distillation can improve performance and that KD, particularly when using the KL divergence loss with a small $\tau$, mitigates label noise.
arXiv Detail & Related papers (2021-05-19T04:40:53Z)
- Why do classifier accuracies show linear trends under distribution shift? [58.40438263312526]
Accuracies of models on one data distribution are approximately linear functions of their accuracies on another distribution.
We assume the probability that two models agree in their predictions is higher than what we can infer from their accuracy levels alone.
We show that a linear trend must occur when evaluating models on two distributions unless the size of the distribution shift is large.
arXiv Detail & Related papers (2020-12-31T07:24:30Z)
- Causal Expectation-Maximisation [70.45873402967297]
We show that causal inference is NP-hard even in models characterised by polytree-shaped graphs.
We introduce the causal EM algorithm to reconstruct the uncertainty about the latent variables from data about categorical manifest variables.
We argue that there appears to be an unnoticed limitation to the trending idea that counterfactual bounds can often be computed without knowledge of the structural equations.
arXiv Detail & Related papers (2020-11-04T10:25:13Z)
- A Ladder of Causal Distances [44.34185575573054]
We introduce a hierarchy of three distances, one for each rung of the "ladder of causation".
We put our causal distances to use by benchmarking standard causal discovery systems on both synthetic and real-world datasets.
Finally, we highlight the usefulness of our causal distances by briefly discussing further applications beyond the evaluation of causal discovery techniques.
arXiv Detail & Related papers (2020-05-05T20:39:07Z)
- CausalVAE: Structured Causal Disentanglement in Variational Autoencoder [52.139696854386976]
The framework of variational autoencoder (VAE) is commonly used to disentangle independent factors from observations.
We propose a new VAE based framework named CausalVAE, which includes a Causal Layer to transform independent factors into causal endogenous ones.
Results show that the causal representations learned by CausalVAE are semantically interpretable, and their causal relationship as a Directed Acyclic Graph (DAG) is identified with good accuracy.
arXiv Detail & Related papers (2020-04-18T20:09:34Z)
- Convex Fairness Constrained Model Using Causal Effect Estimators [6.414055487487486]
We devise novel models, called FairCEEs, which remove discrimination while keeping explanatory bias.
We provide an efficient algorithm for solving FairCEEs in regression and binary classification tasks.
arXiv Detail & Related papers (2020-02-16T03:40:04Z)
- The role of (non)contextuality in Bell's theorems from the perspective of an operational modeling framework [0.0]
It is shown that noncontextuality is the most general property of an operational model that blocks replication of QM predictions.
It is shown that the construction of convex hulls of finite ensembles of OD model instances is (mathematically) equivalent to the traditional hidden variables approach.
arXiv Detail & Related papers (2020-01-23T20:45:25Z)
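To make the temperature claim in the knowledge-distillation entry above concrete, here is a minimal NumPy sketch (the logits and the conventional $\tau^2$ scaling of the loss are illustrative assumptions, not taken from that paper's code). As $\tau$ grows, the temperature-scaled KL between softened teacher and student outputs approaches a mean-squared error on the mean-centred logits, i.e. pure logit matching; as $\tau$ shrinks, the softened teacher collapses onto its hard label.

```python
import numpy as np

def softmax(logits, tau):
    z = np.asarray(logits, float) / tau
    z -= z.max()                               # numerical stability
    e = np.exp(z)
    return e / e.sum()

def kd_kl(teacher_logits, student_logits, tau):
    """tau^2-scaled KL(teacher || student) on temperature-softened outputs."""
    p = softmax(teacher_logits, tau)
    q = softmax(student_logits, tau)
    return tau ** 2 * float(np.sum(p * np.log(p / q)))

def centred_logit_mse(teacher_logits, student_logits):
    """Large-tau limit: MSE between mean-centred teacher and student logits."""
    dt = teacher_logits - teacher_logits.mean()
    ds = student_logits - student_logits.mean()
    return float(np.sum((dt - ds) ** 2)) / (2 * len(teacher_logits))

t = np.array([4.0, 1.0, -2.0, 0.5])            # hypothetical teacher logits
s = np.array([3.0, 2.0, -1.0, 0.0])            # hypothetical student logits

for tau in (0.1, 1.0, 10.0, 100.0):
    print(f"tau={tau:>5}: scaled KL = {kd_kl(t, s, tau):.4f}")
print(f"centred-logit MSE (large-tau limit) = {centred_logit_mse(t, s):.4f}")
# As tau increases, the scaled KL approaches the centred-logit MSE (logit matching).
# As tau -> 0, softmax(t / tau) collapses onto the teacher's argmax, so the
# unscaled KL reduces to cross-entropy with the teacher's hard label (label matching).
```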
This list is automatically generated from the titles and abstracts of the papers on this site.