Stabilizing Estimates of Shapley Values with Control Variates
- URL: http://arxiv.org/abs/2310.07672v3
- Date: Wed, 10 Apr 2024 00:35:36 GMT
- Title: Stabilizing Estimates of Shapley Values with Control Variates
- Authors: Jeremy Goldwasser, Giles Hooker
- Abstract summary: Shapley values are among the most popular tools for explaining predictions of blackbox machine learning models.
Our methodology is applicable to any machine learning model and requires virtually no extra computation or modeling effort.
On several high-dimensional datasets, we find it can produce dramatic reductions in the Monte Carlo variability of Shapley estimates.
- Score: 3.8642937395065124
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Shapley values are among the most popular tools for explaining predictions of blackbox machine learning models. However, their high computational cost motivates the use of sampling approximations, inducing a considerable degree of uncertainty. To stabilize these model explanations, we propose ControlSHAP, an approach based on the Monte Carlo technique of control variates. Our methodology is applicable to any machine learning model and requires virtually no extra computation or modeling effort. On several high-dimensional datasets, we find it can produce dramatic reductions in the Monte Carlo variability of Shapley estimates.
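The abstract names control variates as the underlying Monte Carlo technique. The sketch below illustrates the generic technique only, not the ControlSHAP algorithm itself: it estimates E[exp(U)] for U uniform on [0, 1], using U (whose mean 0.5 is known exactly) as the control variate. The function names, the toy integrand, and the trial sizes are all assumptions chosen for illustration and do not come from the paper.

```python
import math
import random
import statistics


def control_variate_estimate(f_vals, g_vals, g_mean):
    """Adjust the sample mean of f_vals using g_vals, whose true mean g_mean is known."""
    f_bar = statistics.fmean(f_vals)
    g_bar = statistics.fmean(g_vals)
    # Estimated coefficient beta = Cov(f, g) / Var(g), computed from the same samples.
    cov_fg = sum(
        (f - f_bar) * (g - g_bar) for f, g in zip(f_vals, g_vals)
    ) / (len(f_vals) - 1)
    beta = cov_fg / statistics.variance(g_vals)
    # Subtract the scaled error of the control variate's sample mean.
    return f_bar - beta * (g_bar - g_mean)


def run_trials(n_trials=200, n_samples=100, seed=0):
    """Repeat the experiment to compare the spread of the two estimators."""
    rng = random.Random(seed)
    plain, adjusted = [], []
    for _ in range(n_trials):
        u = [rng.random() for _ in range(n_samples)]
        f_vals = [math.exp(x) for x in u]  # toy integrand (assumption)
        plain.append(statistics.fmean(f_vals))
        adjusted.append(control_variate_estimate(f_vals, u, 0.5))
    return plain, adjusted


plain, adjusted = run_trials()
print("plain estimator variance:   ", statistics.variance(plain))
print("adjusted estimator variance:", statistics.variance(adjusted))
```

Because exp(U) and U are strongly correlated, the adjusted estimator should vary far less across trials than the plain sample mean, while both target the same value, e - 1. The more closely the control variate tracks the quantity of interest, the larger the reduction.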
Related papers
- Towards Stable Machine Learning Model Retraining via Slowly Varying Sequences [6.067007470552307]
We propose a methodology for finding sequences of machine learning models that are stable across retraining iterations.
We develop a mixed-integer optimization formulation that is guaranteed to recover optimal models.
Our method shows stronger stability than greedily trained models with a small, controllable sacrifice in predictive power.
arXiv Detail & Related papers (2024-03-28T22:45:38Z)
- Online Variational Sequential Monte Carlo [49.97673761305336]
We build upon the variational sequential Monte Carlo (VSMC) method, which provides computationally efficient and accurate model parameter estimation and Bayesian latent-state inference.
Online VSMC is capable of performing efficiently, entirely on-the-fly, both parameter estimation and particle proposal adaptation.
arXiv Detail & Related papers (2023-12-19T21:45:38Z)
- Efficient Shapley Values Estimation by Amortization for Text Classification [66.7725354593271]
We develop an amortized model that directly predicts each input feature's Shapley Value without additional model evaluations.
Experimental results on two text classification datasets demonstrate that our amortized model estimates Shapley Values accurately with up to 60 times speedup.
arXiv Detail & Related papers (2023-05-31T16:19:13Z)
- Shapley variable importance cloud for machine learning models [4.1359299555083595]
Recently developed Shapley variable importance cloud (ShapleyVIC) provides comprehensive and robust variable importance assessments.
The benefits of ShapleyVIC inference have been demonstrated in real-life prediction tasks.
This work extends the ShapleyVIC implementation to machine learning models to enable wider applications.
arXiv Detail & Related papers (2022-12-16T09:45:22Z)
- PDD-SHAP: Fast Approximations for Shapley Values using Functional Decomposition [2.0559497209595823]
We propose PDD-SHAP, an algorithm that uses an ANOVA-based functional decomposition model to approximate the black-box model being explained.
This allows us to calculate Shapley values orders of magnitude faster than existing methods for large datasets, significantly reducing the amortized cost of computing Shapley values.
arXiv Detail & Related papers (2022-08-26T11:49:54Z)
- Low-variance estimation in the Plackett-Luce model via quasi-Monte Carlo sampling [58.14878401145309]
We develop a novel approach to producing more sample-efficient estimators of expectations in the PL model.
We illustrate our findings both theoretically and empirically using real-world recommendation data from Amazon Music and the Yahoo learning-to-rank challenge.
arXiv Detail & Related papers (2022-05-12T11:15:47Z)
- Using Shapley Values and Variational Autoencoders to Explain Predictive Models with Dependent Mixed Features [2.064612766965483]
We use a variational autoencoder with arbitrary conditioning (VAEAC) to model all feature dependencies simultaneously.
We apply VAEAC to the Abalone data set from the UCI Machine Learning Repository.
arXiv Detail & Related papers (2021-11-26T14:05:45Z)
- Evaluating model-based planning and planner amortization for continuous control [79.49319308600228]
We take a hybrid approach, combining model predictive control (MPC) with a learned model and model-free policy learning.
We find that well-tuned model-free agents are strong baselines even for high DoF control problems.
We show that it is possible to distil a model-based planner into a policy that amortizes the planning without any loss of performance.
arXiv Detail & Related papers (2021-10-07T12:00:40Z)
- Control as Hybrid Inference [62.997667081978825]
We present an implementation of CHI which naturally mediates the balance between iterative and amortised inference.
We verify the scalability of our algorithm on a continuous control benchmark, demonstrating that it outperforms strong model-free and model-based baselines.
arXiv Detail & Related papers (2020-07-11T19:44:09Z)
- Scalable Control Variates for Monte Carlo Methods via Stochastic Optimization [62.47170258504037]
This paper presents a framework that encompasses and generalizes existing approaches that use controls, kernels and neural networks.
Novel theoretical results are presented to provide insight into the variance reduction that can be achieved, and an empirical assessment, including applications to Bayesian inference, is provided in support.
arXiv Detail & Related papers (2020-06-12T22:03:25Z)