Causal Shapley Values: Exploiting Causal Knowledge to Explain Individual
Predictions of Complex Models
- URL: http://arxiv.org/abs/2011.01625v1
- Date: Tue, 3 Nov 2020 11:11:36 GMT
- Title: Causal Shapley Values: Exploiting Causal Knowledge to Explain Individual
Predictions of Complex Models
- Authors: Tom Heskes, Evi Sijben, Ioan Gabriel Bucur, Tom Claassen
- Abstract summary: Shapley values are designed to attribute the difference between a model's prediction and an average baseline to the different features used as input to the model.
We show how 'causal' Shapley values, which exploit causal knowledge about the features, can be derived for general causal graphs without sacrificing any of the desirable properties of Shapley values.
- Score: 6.423239719448169
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Shapley values underlie one of the most popular model-agnostic methods within
explainable artificial intelligence. These values are designed to attribute the
difference between a model's prediction and an average baseline to the
different features used as input to the model. Being based on solid
game-theoretic principles, Shapley values uniquely satisfy several desirable
properties, which is why they are increasingly used to explain the predictions
of possibly complex and highly non-linear machine learning models. Shapley
values are well calibrated to a user's intuition when features are independent,
but may lead to undesirable, counterintuitive explanations when the
independence assumption is violated.
In this paper, we propose a novel framework for computing Shapley values that
generalizes recent work that aims to circumvent the independence assumption. By
employing Pearl's do-calculus, we show how these 'causal' Shapley values can be
derived for general causal graphs without sacrificing any of their desirable
properties. Moreover, causal Shapley values enable us to separate the
contribution of direct and indirect effects. We provide a practical
implementation for computing causal Shapley values based on causal chain graphs
when only partial information is available and illustrate their utility on a
real-world example.
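
As a rough illustration of the attribution described in the abstract, the sketch below computes exact Shapley values by enumerating all feature coalitions and checks that they sum to the difference between the prediction and the average baseline. It is a minimal sketch under stated assumptions, not the authors' implementation: the toy model, the background sample, and the helper names `shapley_values` and `marginal_value` are all hypothetical. The value function shown is the marginal one, which relies on the independence assumption; the causal variant the abstract refers to would instead use an interventional expectation under a `do` operation, which requires a causal graph and is not implemented here.

```python
from itertools import combinations
from math import factorial

import numpy as np


def shapley_values(value, n_features):
    """Exact Shapley values for a set function `value` defined on feature subsets."""
    phi = np.zeros(n_features)
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(len(others) + 1):
            # Standard Shapley weight |S|! (n - |S| - 1)! / n!
            weight = (
                factorial(size)
                * factorial(n_features - size - 1)
                / factorial(n_features)
            )
            for subset in combinations(others, size):
                s = frozenset(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def marginal_value(model, x, background):
    """Hypothetical helper: v(S) = mean of f with features in S fixed to x.

    Features outside S are taken from the background rows, i.e. they are
    treated as independent of the features in S (the independence assumption).
    """
    def value(subset):
        data = background.copy()
        idx = np.array(sorted(subset), dtype=int)
        data[:, idx] = x[idx]
        return float(model(data).mean())
    return value


# Toy model and data, chosen only for illustration.
def model(X):
    return 2.0 * X[:, 0] + X[:, 1] * X[:, 2]


rng = np.random.default_rng(0)
background = rng.normal(size=(200, 3))
x = np.array([1.0, 2.0, 3.0])

phi = shapley_values(marginal_value(model, x, background), n_features=3)
prediction = float(model(x[None, :])[0])
baseline = float(model(background).mean())
```

The attributions in `phi` sum to `prediction - baseline`, which is the efficiency property. Full enumeration costs on the order of 2^n value-function evaluations per feature, so it is only practical for a handful of features.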
Related papers
- Fast Shapley Value Estimation: A Unified Approach [71.92014859992263]
We propose a straightforward and efficient Shapley estimator, SimSHAP, by eliminating redundant techniques.
In our analysis of existing approaches, we observe that estimators can be unified as a linear transformation of randomly summed values from feature subsets.
Our experiments validate the effectiveness of our SimSHAP, which significantly accelerates the computation of accurate Shapley values.
arXiv Detail & Related papers (2023-11-02T06:09:24Z)
- Efficient Shapley Values Estimation by Amortization for Text Classification [66.7725354593271]
We develop an amortized model that directly predicts each input feature's Shapley Value without additional model evaluations.
Experimental results on two text classification datasets demonstrate that our amortized model estimates Shapley Values accurately with up to 60 times speedup.
arXiv Detail & Related papers (2023-05-31T16:19:13Z)
- Near-optimal Offline Reinforcement Learning with Linear Representation: Leveraging Variance Information with Pessimism [65.46524775457928]
Offline reinforcement learning seeks to utilize offline/historical data to optimize sequential decision-making strategies.
We study the statistical limits of offline reinforcement learning with linear model representations.
arXiv Detail & Related papers (2022-03-11T09:00:12Z)
- Exact Shapley Values for Local and Model-True Explanations of Decision Tree Ensembles [0.0]
We consider the application of Shapley values for explaining decision tree ensembles.
We present a novel approach to Shapley value-based feature attribution that can be applied to random forests and boosted decision trees.
arXiv Detail & Related papers (2021-12-16T20:16:02Z)
- Is Shapley Explanation for a model unique? [0.0]
We explore the relationship between the distribution of a feature and its Shapley value.
Our assessment is that the Shapley value for a particular feature depends not only on its mean but also on other moments, such as its variance.
It also varies with the model outcome (probability, log-odds, or a binary decision such as accept vs. reject) and hence with the model application.
arXiv Detail & Related papers (2021-11-23T15:31:46Z)
- Explaining predictive models using Shapley values and non-parametric vine copulas [2.6774008509840996]
We propose two new approaches for modelling the dependence between the features.
The performance of the proposed methods is evaluated on simulated data sets and a real data set.
Experiments demonstrate that the vine copula approaches give more accurate approximations to the true Shapley values than their competitors.
arXiv Detail & Related papers (2021-02-12T09:43:28Z)
- The Struggles of Feature-Based Explanations: Shapley Values vs. Minimal Sufficient Subsets [61.66584140190247]
We show that feature-based explanations pose problems even for explaining trivial models.
We show that two popular classes of explainers, Shapley explainers and minimal sufficient subsets explainers, target fundamentally different types of ground-truth explanations.
arXiv Detail & Related papers (2020-09-23T09:45:23Z)
- Predictive and Causal Implications of using Shapley Value for Model Interpretation [6.744385328015561]
We established the relationship between Shapley value and conditional independence, a key concept in both predictive and causal modeling.
Our results indicate that eliminating a variable with a high Shapley value from a model does not necessarily impair predictive performance.
More importantly, the Shapley value of a variable does not reflect its causal relationship with the target of interest.
arXiv Detail & Related papers (2020-08-12T01:08:08Z)
- Explaining the data or explaining a model? Shapley values that uncover non-linear dependencies [0.0]
We introduce and demonstrate the use of the energy distance correlation, the affine-invariant distance correlation, and the Hilbert-Schmidt independence criterion as Shapley value characteristic functions.
arXiv Detail & Related papers (2020-07-12T15:04:59Z)
- Good Classifiers are Abundant in the Interpolating Regime [64.72044662855612]
We develop a methodology to compute precisely the full distribution of test errors among interpolating classifiers.
We find that test errors tend to concentrate around a small typical value $\varepsilon^*$, which deviates substantially from the test error of the worst-case interpolating model.
Our results show that the usual style of analysis in statistical learning theory may not be fine-grained enough to capture the good generalization performance observed in practice.
arXiv Detail & Related papers (2020-06-22T21:12:31Z)
- Towards Efficient Data Valuation Based on the Shapley Value [65.4167993220998]
We study the problem of data valuation by utilizing the Shapley value.
The Shapley value defines a unique payoff scheme that satisfies many desiderata for the notion of data value.
We propose a repertoire of efficient algorithms for approximating the Shapley value.
arXiv Detail & Related papers (2019-02-27T00:22:43Z)