PDD-SHAP: Fast Approximations for Shapley Values using Functional
Decomposition
- URL: http://arxiv.org/abs/2208.12595v1
- Date: Fri, 26 Aug 2022 11:49:54 GMT
- Title: PDD-SHAP: Fast Approximations for Shapley Values using Functional
Decomposition
- Authors: Arne Gevaert, Yvan Saeys
- Abstract summary: We propose PDD-SHAP, an algorithm that uses an ANOVA-based functional decomposition model to approximate the black-box model being explained.
This allows us to calculate Shapley values orders of magnitude faster than existing methods for large datasets, significantly reducing the amortized cost of computing Shapley values.
- Score: 2.0559497209595823
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Because of their strong theoretical properties, Shapley values have become
very popular as a way to explain predictions made by black box models.
Unfortunately, most existing techniques to compute Shapley values are
computationally very expensive. We propose PDD-SHAP, an algorithm that uses an
ANOVA-based functional decomposition model to approximate the black-box model
being explained. This allows us to calculate Shapley values orders of magnitude
faster than existing methods for large datasets, significantly reducing the
amortized cost of computing Shapley values when many predictions need to be
explained.
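The abstract does not spell out how a functional decomposition makes Shapley values cheap to compute. One standard identity that may help build intuition: if the value of a feature coalition T is the sum of all components f_S with S contained in T, then the Shapley value of feature i is the sum of f_S / |S| over all components S containing i. The sketch below checks this identity against a brute-force computation on toy numbers. It is an illustration under that assumption, not the authors' implementation; the function names and the component values are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_from_components(components, n_features):
    """Shapley values assuming v(T) = sum of f_S over all S contained in T.

    `components` maps a tuple of feature indices S to the (hypothetical)
    value f_S(x_S) of that component at one fixed instance x.
    """
    phi = [0.0] * n_features
    for subset, value in components.items():
        if not subset:  # the constant term is not attributed to any feature
            continue
        for i in subset:
            phi[i] += value / len(subset)  # split each component equally
    return phi


def coalition_value(components, coalition):
    """v(T): sum of the components whose feature set lies inside T."""
    members = set(coalition)
    return sum(v for s, v in components.items() if set(s) <= members)


def shapley_brute_force(components, n_features):
    """Reference implementation using the classic subset-sum formula."""
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for k in range(len(others) + 1):
            weight = (factorial(k) * factorial(n_features - k - 1)
                      / factorial(n_features))
            for subset in combinations(others, k):
                gain = (coalition_value(components, subset + (i,))
                        - coalition_value(components, subset))
                phi[i] += weight * gain
    return phi


# Toy component values for a 3-feature model (made up for illustration).
components = {
    (0,): 1.0, (1,): -0.5, (2,): 0.25,
    (0, 1): 0.6, (1, 2): -0.3,
    (0, 1, 2): 0.9,
}
fast = shapley_from_components(components, 3)
slow = shapley_brute_force(components, 3)
```

The first function touches each component once, while the brute-force version enumerates every coalition for every feature. That gap is the kind of saving a decomposition-based surrogate can offer once its components have been fitted.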
Related papers
- Improving the Sampling Strategy in KernelSHAP [0.8057006406834466]
KernelSHAP framework enables us to approximate the Shapley values using a sampled subset of weighted conditional expectations.
We propose three main novel contributions: a stabilizing technique to reduce the variance of the weights in the current state-of-the-art strategy, a novel weighing scheme that corrects the Shapley kernel weights based on sampled subsets, and a straightforward strategy that includes the important subsets and integrates them with the corrected Shapley kernel weights.
arXiv Detail & Related papers (2024-10-07T10:02:31Z)
- Accelerated Shapley Value Approximation for Data Evaluation [3.707457963532597]
We show that Shapley value of data points can be approximated more efficiently by leveraging structural properties of machine learning problems.
Our analysis suggests that models trained on small subsets are in fact more important in the context of data valuation.
arXiv Detail & Related papers (2023-11-09T13:15:36Z)
- Fast Shapley Value Estimation: A Unified Approach [71.92014859992263]
We propose a straightforward and efficient Shapley estimator, SimSHAP, by eliminating redundant techniques.
In our analysis of existing approaches, we observe that estimators can be unified as a linear transformation of randomly summed values from feature subsets.
Our experiments validate the effectiveness of our SimSHAP, which significantly accelerates the computation of accurate Shapley values.
arXiv Detail & Related papers (2023-11-02T06:09:24Z)
- Computing SHAP Efficiently Using Model Structure Information [3.6626323701161665]
We propose methods that compute SHAP exactly in time or even faster for SHAP definitions that satisfy our additivity and dummy assumptions.
For the first case, we demonstrate an additive property and a way to compute SHAP from the lower-order functional components.
For the second case, we derive formulas that can compute SHAP in time. Both methods yield exact SHAP results.
arXiv Detail & Related papers (2023-09-05T17:48:09Z)
- Efficient Shapley Values Estimation by Amortization for Text Classification [66.7725354593271]
We develop an amortized model that directly predicts each input feature's Shapley Value without additional model evaluations.
Experimental results on two text classification datasets demonstrate that our amortized model estimates Shapley Values accurately with up to 60 times speedup.
arXiv Detail & Related papers (2023-05-31T16:19:13Z)
- Shapley-NAS: Discovering Operation Contribution for Neural Architecture Search [96.20505710087392]
We propose a Shapley value based method to evaluate operation contribution (Shapley-NAS) for neural architecture search.
We show that our method outperforms the state-of-the-art methods by a considerable margin with light search cost.
arXiv Detail & Related papers (2022-06-20T14:41:49Z)
- FastSHAP: Real-Time Shapley Value Estimation [25.536804325758805]
FastSHAP is a method for estimating Shapley values in a single forward pass using a learned explainer model.
It amortizes the cost of explaining many inputs via a learning approach inspired by Shapley value's weighted least squares characterization.
It generates high-quality explanations with orders of magnitude speedup.
arXiv Detail & Related papers (2021-07-15T16:34:45Z)
- Fast Hierarchical Games for Image Explanations [78.16853337149871]
We present a model-agnostic explanation method for image classification based on a hierarchical extension of Shapley coefficients.
Unlike other Shapley-based explanation methods, h-Shap is scalable and can be computed without the need of approximation.
We compare our hierarchical approach with popular Shapley-based and non-Shapley-based methods on a synthetic dataset, a medical imaging scenario, and a general computer vision problem.
arXiv Detail & Related papers (2021-04-13T13:11:02Z)
- Approximation Algorithms for Sparse Principal Component Analysis [57.5357874512594]
Principal component analysis (PCA) is a widely used dimension reduction technique in machine learning and statistics.
Various approaches to obtain sparse principal direction loadings have been proposed, which are termed Sparse Principal Component Analysis.
We present thresholding as a provably accurate, time, approximation algorithm for the SPCA problem.
arXiv Detail & Related papers (2020-06-23T04:25:36Z)
- Towards Efficient Data Valuation Based on the Shapley Value [65.4167993220998]
We study the problem of data valuation by utilizing the Shapley value.
The Shapley value defines a unique payoff scheme that satisfies many desiderata for the notion of data value.
We propose a repertoire of efficient algorithms for approximating the Shapley value.
arXiv Detail & Related papers (2019-02-27T00:22:43Z)