Fast Approximation of the Shapley Values Based on Order-of-Addition
Experimental Designs
- URL: http://arxiv.org/abs/2309.08923v1
- Date: Sat, 16 Sep 2023 08:28:15 GMT
- Title: Fast Approximation of the Shapley Values Based on Order-of-Addition
Experimental Designs
- Authors: Liuqing Yang, Yongdao Zhou, Haoda Fu, Min-Qian Liu, Wei Zheng
- Abstract summary: In a $d$-player coalition game, calculating a Shapley value requires the evaluation of $d!$ or $2^d$ marginal contribution values.
A common remedy is to take a random sample of the permutations to surrogate for the complete list of permutations.
We find an advanced sampling scheme can be designed to yield much more accurate estimation of the Shapley value.
- Score: 11.684428415506968
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The Shapley value originated in econometrics as a way to fairly
distribute both gains and costs among players in a coalition game. In recent
decades, its application has been extended to other areas such as marketing,
engineering and machine learning. For example, it produces reasonable solutions
for problems in sensitivity analysis, local model explanation for interpretable
machine learning, node importance in social networks, attribution models, etc.
However, its heavy computational burden has long been recognized but rarely
investigated. Specifically, in a $d$-player coalition game, calculating a
Shapley value requires the evaluation of $d!$ or $2^d$ marginal contribution
values, depending on whether we take the permutation or combination
formulation of the Shapley value. Hence it becomes infeasible to calculate the
Shapley value when $d$ is reasonably large. A common remedy is to take a random
sample of the permutations to surrogate for the complete list of permutations.
We find that an advanced sampling scheme can be designed to yield much more
accurate estimation of the Shapley value than simple random sampling (SRS). Our
sampling scheme is based on combinatorial structures in the field of design of
experiments (DOE), particularly order-of-addition experimental designs, which
study how the ordering of components affects the output. We show that the
obtained estimates are unbiased, and can sometimes deterministically recover
the original Shapley value. Both theoretical and simulation results show that
our DOE-based sampling scheme outperforms SRS in terms of estimation accuracy.
Surprisingly, it is also slightly faster than SRS. Lastly, real data analysis
is conducted for the C. elegans nervous system and the 9/11 terrorist network.
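To make the computational trade-off concrete, the following sketch contrasts the exact permutation formulation of the Shapley value (averaging marginal contributions over all $d!$ orderings) with the SRS baseline the paper improves upon. This is a minimal illustration on a hypothetical 3-player game with value function $v(S) = (\sum_{i \in S} w_i)^2$, not the authors' DOE-based scheme; the function names and toy game are assumptions for illustration only.

```python
import itertools
import random

def shapley_exact(players, value):
    """Exact Shapley values via all d! permutations (feasible only for small d)."""
    d = len(players)
    phi = {p: 0.0 for p in players}
    perms = list(itertools.permutations(players))
    for perm in perms:
        coalition = []
        prev = value(frozenset(coalition))  # v(empty set)
        for p in perm:
            coalition.append(p)
            cur = value(frozenset(coalition))
            phi[p] += cur - prev  # marginal contribution of p in this ordering
            prev = cur
    return {p: phi[p] / len(perms) for p in players}

def shapley_srs(players, value, n_samples, seed=0):
    """Unbiased SRS estimate: average marginal contributions over a random
    sample of permutations instead of the full list of d! orderings."""
    rng = random.Random(seed)
    phi = {p: 0.0 for p in players}
    for _ in range(n_samples):
        perm = list(players)
        rng.shuffle(perm)
        coalition = []
        prev = value(frozenset(coalition))
        for p in perm:
            coalition.append(p)
            cur = value(frozenset(coalition))
            phi[p] += cur - prev
            prev = cur
    return {p: phi[p] / n_samples for p in players}

# Toy 3-player game: v(S) = (sum of weights in S)^2 (hypothetical example).
weights = {"a": 1.0, "b": 2.0, "c": 3.0}
v = lambda S: sum(weights[p] for p in S) ** 2

exact = shapley_exact(list(weights), v)
approx = shapley_srs(list(weights), v, n_samples=2000)
```

The SRS estimator is unbiased because each sampled permutation contributes an unbiased draw of the marginal-contribution average; the DOE-based scheme in the paper replaces the i.i.d. permutation sample with a structured order-of-addition design to reduce variance at the same sample size.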
Related papers
- Exactly Computing do-Shapley Values [16.29328788692574]
We show how to compute do-Shapley values in time linear in the number of irreducible sets $r$. We also prove that non-parametric identifiability of do-Shapley values requires only the identification of interventional effects for the $d$ singleton coalitions.
arXiv Detail & Related papers (2026-02-06T21:24:56Z) - ZIP-RC: Optimizing Test-Time Compute via Zero-Overhead Joint Reward-Cost Prediction [57.799425838564]
We present ZIP-RC, an adaptive inference method that equips models with zero-overhead inference-time predictions of reward and cost. ZIP-RC improves accuracy by up to 12% over majority voting at equal or lower average cost.
arXiv Detail & Related papers (2025-12-01T09:44:31Z) - Antithetic Sampling for Top-k Shapley Identification [19.221081896134567]
The Shapley value's popularity in and outside of explainable AI stems from its axiomatic uniqueness.
Most works investigate the uniform approximation of all features' Shapley values, needlessly consuming samples for insignificant features.
In contrast, identifying the $k$ most important features can already be sufficiently insightful and yields the potential to leverage algorithmic opportunities connected to the field of multi-armed bandits.
arXiv Detail & Related papers (2025-04-02T15:38:32Z) - FW-Shapley: Real-time Estimation of Weighted Shapley Values [21.562508939780532]
We present Fast Weighted Shapley, an amortized framework for efficiently computing weighted Shapley values.
We also show that our estimator's training procedure is theoretically valid even though we do not use ground truth weighted Shapley values during training.
For data valuation, we are much faster (14 times) while being comparable to the state-of-the-art KNN Shapley.
arXiv Detail & Related papers (2025-03-09T13:13:14Z) - Shapley Value Approximation Based on k-Additive Games [21.023521613326654]
The Shapley value is the prevalent solution for fair division problems in which a payout is to be divided among multiple agents.
Despite its popularity and axiomatic justification, the Shapley value suffers from a computational complexity that scales exponentially with the number of entities involved.
We propose SVA$k\text{ADD}$, a novel approximation method that fits a $k$-additive surrogate game.
arXiv Detail & Related papers (2025-02-07T08:52:57Z) - Parallel simulation for sampling under isoperimetry and score-based diffusion models [56.39904484784127]
As data size grows, reducing the iteration cost becomes an important goal.
Inspired by the success of the parallel simulation of the initial value problem in scientific computation, we propose parallel Picard methods for sampling tasks.
Our work highlights the potential advantages of simulation methods in scientific computation for dynamics-based sampling and diffusion models.
arXiv Detail & Related papers (2024-12-10T11:50:46Z) - Improving the Sampling Strategy in KernelSHAP [0.8057006406834466]
KernelSHAP framework enables us to approximate the Shapley values using a sampled subset of weighted conditional expectations.
We propose three main novel contributions: a stabilizing technique to reduce the variance of the weights in the current state-of-the-art strategy, a novel weighing scheme that corrects the Shapley kernel weights based on sampled subsets, and a straightforward strategy that includes the important subsets and integrates them with the corrected Shapley kernel weights.
arXiv Detail & Related papers (2024-10-07T10:02:31Z) - Fast Shapley Value Estimation: A Unified Approach [71.92014859992263]
We propose a straightforward and efficient Shapley estimator, SimSHAP, by eliminating redundant techniques.
In our analysis of existing approaches, we observe that estimators can be unified as a linear transformation of randomly summed values from feature subsets.
Our experiments validate the effectiveness of our SimSHAP, which significantly accelerates the computation of accurate Shapley values.
arXiv Detail & Related papers (2023-11-02T06:09:24Z) - An Efficient Shapley Value Computation for the Naive Bayes Classifier [0.0]
This article proposes an exact analytic expression of Shapley values in the case of the naive Bayes classifier.
Results show that our Shapley proposal for the naive Bayes provides informative results with low algorithmic complexity.
arXiv Detail & Related papers (2023-07-31T14:39:10Z) - DU-Shapley: A Shapley Value Proxy for Efficient Dataset Valuation [23.646508094051768]
We consider the dataset valuation problem, that is, the problem of quantifying the incremental gain.
The Shapley value is a natural tool to perform dataset valuation due to its formal axiomatic justification.
We propose a novel approximation, referred to as discrete uniform Shapley, which is expressed as an expectation under a discrete uniform distribution.
arXiv Detail & Related papers (2023-06-03T10:22:50Z) - Efficient Shapley Values Estimation by Amortization for Text
Classification [66.7725354593271]
We develop an amortized model that directly predicts each input feature's Shapley Value without additional model evaluations.
Experimental results on two text classification datasets demonstrate that our amortized model estimates Shapley Values accurately with up to 60 times speedup.
arXiv Detail & Related papers (2023-05-31T16:19:13Z) - SHAP-XRT: The Shapley Value Meets Conditional Independence Testing [21.794110108580746]
We show that Shapley-based explanation methods and conditional independence testing are closely related.
We introduce the SHAPley EXplanation Randomization Test (SHAP-XRT), a testing procedure inspired by the Conditional Randomization Test (CRT) for a specific notion of local (i.e., on a sample) conditional independence.
We show that the Shapley value itself provides an upper bound to the expected $p$-value of a global (i.e., overall) null hypothesis.
arXiv Detail & Related papers (2022-07-14T16:28:54Z) - Shapley-NAS: Discovering Operation Contribution for Neural Architecture
Search [96.20505710087392]
We propose a Shapley value based method to evaluate operation contribution (Shapley-NAS) for neural architecture search.
We show that our method outperforms the state-of-the-art methods by a considerable margin with light search cost.
arXiv Detail & Related papers (2022-06-20T14:41:49Z) - Breaking the Sample Complexity Barrier to Regret-Optimal Model-Free
Reinforcement Learning [52.76230802067506]
A novel model-free algorithm is proposed to minimize regret in episodic reinforcement learning.
The proposed algorithm employs an early-settled reference update rule, with the aid of two Q-learning sequences.
The design principle of our early-settled variance reduction method might be of independent interest to other RL settings.
arXiv Detail & Related papers (2021-10-09T21:13:48Z) - WOR and $p$'s: Sketches for $\ell_p$-Sampling Without Replacement [75.12782480740822]
We design novel composable sketches for WOR $\ell_p$ sampling.
Our sketches have size that grows only linearly with the sample size.
Our method is the first to provide WOR sampling in the important regime of $p>1$ and the first to handle signed updates for $p>0$.
arXiv Detail & Related papers (2020-07-14T00:19:27Z) - Towards Efficient Data Valuation Based on the Shapley Value [65.4167993220998]
We study the problem of data valuation by utilizing the Shapley value.
The Shapley value defines a unique payoff scheme that satisfies many desiderata for the notion of data value.
We propose a repertoire of efficient algorithms for approximating the Shapley value.
arXiv Detail & Related papers (2019-02-27T00:22:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.