Precision of Individual Shapley Value Explanations
- URL: http://arxiv.org/abs/2312.03485v1
- Date: Wed, 6 Dec 2023 13:29:23 GMT
- Title: Precision of Individual Shapley Value Explanations
- Authors: Lars Henry Berge Olsen
- Abstract summary: Shapley values are extensively used in explainable artificial intelligence (XAI) as a framework to explain predictions made by complex machine learning (ML) models.
We show that the explanations are systematically less precise for observations in the outer region of the training data distribution.
This is expected from a statistical point of view, but to the best of our knowledge, it has not been systematically addressed in the Shapley value literature.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Shapley values are extensively used in explainable artificial intelligence
(XAI) as a framework to explain predictions made by complex machine learning
(ML) models. In this work, we focus on conditional Shapley values for
predictive models fitted to tabular data and explain the prediction
$f(\boldsymbol{x}^{*})$ for a single observation $\boldsymbol{x}^{*}$ at a
time. Numerous Shapley value estimation methods have been proposed and
empirically compared on an average basis in the XAI literature. However, less
focus has been devoted to analyzing the precision of the Shapley value
explanations on an individual basis. We extend our work in Olsen et al. (2023)
by demonstrating and discussing that the explanations are systematically less
precise for observations in the outer region of the training data
distribution, for all of the estimation methods considered. This is expected
from a statistical point of view, but to the best of our knowledge, it has not
been systematically addressed in the Shapley value literature. This is crucial
knowledge for Shapley value practitioners, who should be more cautious when
applying the Shapley value explanations of such observations.
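To make the quantity being explained concrete: the conditional Shapley value of feature $j$ for the prediction $f(\boldsymbol{x}^{*})$ is $\phi_{j}(\boldsymbol{x}^{*}) = \sum_{S \subseteq \mathcal{M}\setminus\{j\}} \frac{|S|!\,(M-|S|-1)!}{M!}\,\big[v(S \cup \{j\}) - v(S)\big]$ with $v(S) = \mathbb{E}\big[f(\boldsymbol{x}) \mid \boldsymbol{x}_{S} = \boldsymbol{x}^{*}_{S}\big]$, where $\mathcal{M} = \{1, \ldots, M\}$ is the feature set. The sketch below is a minimal brute-force illustration of this definition, not one of the estimation methods compared in the paper: it assumes a fitted model f mapping a 2-D array of observations to a 1-D array of predictions and a training matrix X_train, and it approximates $v(S)$ by Monte Carlo under a (possibly crude) joint Gaussian assumption on the features.
```python
# Minimal sketch (not the paper's implementation): brute-force conditional Shapley
# values for a single observation x_star. The conditional expectation
# v(S) = E[f(x) | x_S = x*_S] is estimated by Monte Carlo under an assumed joint
# Gaussian distribution for the features, fitted to the training data.
# Cost is exponential in the number of features M; for illustration only.
import itertools
import math

import numpy as np


def conditional_shapley(f, x_star, X_train, n_mc=1000, seed=0):
    rng = np.random.default_rng(seed)
    x_star = np.asarray(x_star, dtype=float)
    X_train = np.asarray(X_train, dtype=float)
    M = X_train.shape[1]
    mu = X_train.mean(axis=0)
    Sigma = np.cov(X_train, rowvar=False)

    def v(S):
        """Monte Carlo estimate of E[f(x) | x_S = x*_S]."""
        S = list(S)
        if len(S) == M:                      # all features conditioned on
            return float(f(x_star[None, :])[0])
        Sc = [j for j in range(M) if j not in S]
        draws = np.tile(x_star, (n_mc, 1))   # keep x*_S fixed in the S columns
        if not S:                            # empty coalition: plain marginal of x
            draws[:, Sc] = rng.multivariate_normal(mu, Sigma, size=n_mc)
        else:                                # Gaussian conditional of x_Sc given x_S = x*_S
            A = Sigma[np.ix_(Sc, S)] @ np.linalg.inv(Sigma[np.ix_(S, S)])
            cond_mu = mu[Sc] + A @ (x_star[S] - mu[S])
            cond_Sigma = Sigma[np.ix_(Sc, Sc)] - A @ Sigma[np.ix_(S, Sc)]
            draws[:, Sc] = rng.multivariate_normal(cond_mu, cond_Sigma, size=n_mc)
        return float(np.mean(f(draws)))

    phi = np.zeros(M)
    for j in range(M):
        others = [k for k in range(M) if k != j]
        for size in range(M):
            # Shapley weight |S|! (M - |S| - 1)! / M!
            w = math.factorial(size) * math.factorial(M - size - 1) / math.factorial(M)
            for S in itertools.combinations(others, size):
                phi[j] += w * (v(S + (j,)) - v(S))
    return phi


# Hypothetical usage with a fitted model exposing predict(X) -> 1-D array:
#   phi = conditional_shapley(model.predict, X_test[0], X_train)
# By the efficiency property, phi.sum() is then approximately
#   model.predict(X_test[:1])[0] - model.predict(X_train).mean()
```
A caching layer for v(S) and a proper sampling scheme over coalitions would be needed for anything beyond a handful of features; the point here is only to pin down the estimand whose per-observation precision the paper studies.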
Related papers
- Improving the Sampling Strategy in KernelSHAP [0.8057006406834466]
The KernelSHAP framework enables us to approximate the Shapley values using a sampled subset of weighted conditional expectations.
We propose three main novel contributions: a stabilizing technique to reduce the variance of the weights in the current state-of-the-art strategy, a novel weighting scheme that corrects the Shapley kernel weights based on sampled subsets, and a straightforward strategy that includes the important subsets and integrates them with the corrected Shapley kernel weights.
arXiv Detail & Related papers (2024-10-07T10:02:31Z)
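KernelSHAP-style estimators recast Shapley value estimation as a weighted least-squares problem over coalitions, with each sampled coalition weighted by the Shapley kernel. The snippet below sketches that standard kernel weight (as in the original KernelSHAP formulation), not the corrected weighting scheme proposed in the entry above; the empty and full coalitions receive infinite weight and are normally enforced as exact constraints rather than sampled.
```python
# Sketch of the standard Shapley kernel weight used by KernelSHAP-style estimators;
# the paper's stabilized/corrected weights are refinements of this baseline.
from math import comb


def shapley_kernel_weight(M: int, s: int) -> float:
    """Shapley kernel weight for a coalition of size s out of M features."""
    if s == 0 or s == M:
        return float("inf")  # handled as exact constraints, not sampled
    return (M - 1) / (comb(M, s) * s * (M - s))


# Example: with M = 8 features, the smallest and largest sampled coalition
# sizes receive the most weight, so mid-sized coalitions are rarely drawn.
M = 8
for s in range(M + 1):
    print(s, shapley_kernel_weight(M, s))
```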
- Fast Shapley Value Estimation: A Unified Approach [71.92014859992263]
We propose a straightforward and efficient Shapley estimator, SimSHAP, by eliminating redundant techniques.
In our analysis of existing approaches, we observe that estimators can be unified as a linear transformation of randomly summed values from feature subsets.
Our experiments validate the effectiveness of our SimSHAP, which significantly accelerates the computation of accurate Shapley values.
arXiv Detail & Related papers (2023-11-02T06:09:24Z)
- Efficient Shapley Values Estimation by Amortization for Text Classification [66.7725354593271]
We develop an amortized model that directly predicts each input feature's Shapley Value without additional model evaluations.
Experimental results on two text classification datasets demonstrate that our amortized model estimates Shapley Values accurately with up to 60 times speedup.
arXiv Detail & Related papers (2023-05-31T16:19:13Z)
- SHAP-XRT: The Shapley Value Meets Conditional Independence Testing [21.794110108580746]
We show that Shapley-based explanation methods and conditional independence testing are closely related.
We introduce the SHAPley EXplanation Randomization Test (SHAP-XRT), a testing procedure inspired by the Conditional Randomization Test (CRT) for a specific notion of local (i.e., on a sample) conditional independence.
We show that the Shapley value itself provides an upper bound on the expected $p$-value of a global (i.e., overall) null hypothesis.
arXiv Detail & Related papers (2022-07-14T16:28:54Z)
- Accurate Shapley Values for explaining tree-based models [0.0]
We introduce two estimators of Shapley Values that exploit the tree structure efficiently and are more accurate than state-of-the-art methods.
These methods are available as a Python package.
arXiv Detail & Related papers (2021-06-07T17:35:54Z)
- Fast Hierarchical Games for Image Explanations [78.16853337149871]
We present a model-agnostic explanation method for image classification based on a hierarchical extension of Shapley coefficients.
Unlike other Shapley-based explanation methods, h-Shap is scalable and can be computed without the need for approximation.
We compare our hierarchical approach with popular Shapley-based and non-Shapley-based methods on a synthetic dataset, a medical imaging scenario, and a general computer vision problem.
arXiv Detail & Related papers (2021-04-13T13:11:02Z)
- Beyond Trivial Counterfactual Explanations with Diverse Valuable Explanations [64.85696493596821]
In computer vision applications, generative counterfactual methods indicate how to perturb a model's input to change its prediction.
We propose a counterfactual method that learns a perturbation in a disentangled latent space that is constrained using a diversity-enforcing loss.
Our model improves the success rate of producing high-quality valuable explanations when compared to previous state-of-the-art methods.
arXiv Detail & Related papers (2021-03-18T12:57:34Z)
- Causal Shapley Values: Exploiting Causal Knowledge to Explain Individual Predictions of Complex Models [6.423239719448169]
Shapley values are designed to attribute the difference between a model's prediction and an average baseline to the different features used as input to the model.
We show how these 'causal' Shapley values can be derived for general causal graphs without sacrificing any of their desirable properties.
arXiv Detail & Related papers (2020-11-03T11:11:36Z)
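One property worth keeping in mind for this entry and for the main abstract above is the efficiency axiom shared by these Shapley value variants: the feature attributions sum exactly to the deviation of the explained prediction from the average (baseline) prediction. In the notation of the main abstract,
$$\sum_{j=1}^{M} \phi_{j}(\boldsymbol{x}^{*}) = f(\boldsymbol{x}^{*}) - \phi_{0}, \qquad \phi_{0} = \mathbb{E}\big[f(\boldsymbol{x})\big].$$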
- The Struggles of Feature-Based Explanations: Shapley Values vs. Minimal Sufficient Subsets [61.66584140190247]
We show that feature-based explanations pose problems even for explaining trivial models.
We show that two popular classes of explainers, Shapley explainers and minimal sufficient subsets explainers, target fundamentally different types of ground-truth explanations.
arXiv Detail & Related papers (2020-09-23T09:45:23Z)
- Predictive and Causal Implications of using Shapley Value for Model Interpretation [6.744385328015561]
We establish the relationship between the Shapley value and conditional independence, a key concept in both predictive and causal modeling.
Our results indicate that eliminating a variable with a high Shapley value from a model does not necessarily impair predictive performance.
More importantly, the Shapley value of a variable does not reflect its causal relationship with the target of interest.
arXiv Detail & Related papers (2020-08-12T01:08:08Z)
- Towards Efficient Data Valuation Based on the Shapley Value [65.4167993220998]
We study the problem of data valuation by utilizing the Shapley value.
The Shapley value defines a unique payoff scheme that satisfies many desiderata for the notion of data value.
We propose a repertoire of efficient algorithms for approximating the Shapley value.
arXiv Detail & Related papers (2019-02-27T00:22:43Z)
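To illustrate why approximation algorithms are needed at all for Shapley-based data valuation (the exact value requires exponentially many utility evaluations), the sketch below uses generic Monte Carlo permutation sampling. It is a standard baseline, not one of the efficient algorithms proposed in the paper, and utility is a hypothetical placeholder for the value function v(S), e.g. the validation performance of a model trained on the data points in S.
```python
# Generic Monte Carlo (permutation sampling) approximation of Shapley values for
# data valuation; a standard baseline, not the paper's proposed algorithms.
import numpy as np


def permutation_shapley(utility, n_points, n_permutations=200, seed=0):
    """Average each point's marginal contribution over random orderings."""
    rng = np.random.default_rng(seed)
    phi = np.zeros(n_points)
    for _ in range(n_permutations):
        perm = rng.permutation(n_points)
        prev_value = utility([])                     # v(empty set)
        for k, i in enumerate(perm):
            value = utility(perm[: k + 1].tolist())  # v(first k+1 points in this order)
            phi[i] += value - prev_value             # marginal contribution of point i
            prev_value = value
    return phi / n_permutations


# Hypothetical usage, where my_utility(indices) retrains a model on the given
# data points and returns its validation score:
#   phi = permutation_shapley(my_utility, n_points=100, n_permutations=500)
```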