Amortized SHAP values via sparse Fourier function approximation
- URL: http://arxiv.org/abs/2410.06300v1
- Date: Tue, 8 Oct 2024 19:05:50 GMT
- Title: Amortized SHAP values via sparse Fourier function approximation
- Authors: Ali Gorji, Andisheh Amrollahi, Andreas Krause
- Abstract summary: SHAP values are a popular local feature-attribution method widely used in interpretable and explainable AI.
We propose a two-stage approach for estimating SHAP values.
Our algorithm's first step harnesses recent results showing that many real-world predictors have a spectral bias.
- Score: 38.818224762845624
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: SHAP values are a popular local feature-attribution method widely used in interpretable and explainable AI. We tackle the problem of efficiently computing these values. We cover both the model-agnostic (black-box) setting, where one only has query access to the model, and the case of (ensembles of) trees, where one has access to the structure of the tree. For both the black-box and the tree setting we propose a two-stage approach for estimating SHAP values. Our algorithm's first step harnesses recent results showing that many real-world predictors have a spectral bias that allows us to either represent them exactly (in the case of ensembles of decision trees) or approximate them efficiently (in the case of neural networks) using a compact Fourier representation. In the second step of the algorithm, we use the Fourier representation to exactly compute SHAP values. The second step is computationally very cheap because, firstly, the representation is compact and, secondly, we prove that there exists a closed-form expression for SHAP values for the Fourier basis functions. Furthermore, the expression we derive effectively linearizes the computation into a simple summation and is amenable to parallelization on multiple cores or a GPU. Since the function approximation (first step) is only done once, it allows us to produce Shapley values in an amortized way. We show speedups over relevant baseline methods at equal levels of accuracy for both the tree and black-box settings. Moreover, this approach introduces a reliable and fine-grained continuous trade-off between computation and accuracy through the sparsity of the Fourier approximation, a feature previously unavailable in black-box methods.
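As a rough, hedged illustration of the amortized two-stage structure (this is not the paper's algorithm), the Python sketch below treats a black-box model over binary features. Stage 1 stands in for the sparse Fourier approximation with a plain least-squares fit on a hypothetical low-degree candidate support; Stage 2 relies on the additivity of Shapley values across the terms of a sum and, instead of the closed-form expression derived in the paper, attributes each Fourier term by enumerating its small support under an assumed single-baseline value function. All names (`fit_sparse_fourier`, `shap_of_term`, `amortized_shap`) are illustrative.

```python
# Minimal sketch of the amortized two-stage idea (not the paper's algorithm).
# Assumptions: binary features in {0,1}^n and a single-baseline value function.
from itertools import chain, combinations
from math import factorial

import numpy as np


def parity(S, x):
    """Walsh-Hadamard basis function chi_S(x) = prod_{i in S} (-1)^{x_i}."""
    return float(np.prod([(-1) ** int(x[i]) for i in S]))


def fit_sparse_fourier(model, n, max_degree=2, n_samples=2048, tol=1e-3, seed=0):
    """Stage 1 (placeholder): estimate low-degree Fourier coefficients of the
    black-box `model` from random queries and keep only the large ones."""
    rng = np.random.default_rng(seed)
    X = rng.integers(0, 2, size=(n_samples, n))
    y = np.array([model(x) for x in X])
    support = list(chain.from_iterable(
        combinations(range(n), d) for d in range(max_degree + 1)))
    Phi = np.array([[parity(S, x) for S in support] for x in X])
    coef, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return [(support[j], coef[j]) for j in np.flatnonzero(np.abs(coef) > tol)]


def shap_of_term(S, c, x, baseline):
    """Exact Shapley values of a single term c * chi_S under a single-baseline
    value function; features outside S are null players, so enumerating the
    subsets of S suffices (cheap when the degree |S| is small)."""
    S, k = list(S), len(S)

    def value(coalition):
        # Features in the coalition keep their value from x; the others are
        # replaced by the baseline.
        z = [x[j] if j in coalition else baseline[j] for j in S]
        return c * float(np.prod([(-1) ** int(zj) for zj in z]))

    phi = {}
    for i in S:
        rest = [j for j in S if j != i]
        total = 0.0
        for r in range(k):
            w = factorial(r) * factorial(k - r - 1) / factorial(k)
            for T in combinations(rest, r):
                total += w * (value(set(T) | {i}) - value(set(T)))
        phi[i] = total
    return phi


def amortized_shap(terms, x, baseline, n):
    """Stage 2: Shapley values are additive across the terms of a sum, so the
    full attribution is a simple summation over the sparse representation."""
    phi = np.zeros(n)
    for S, c in terms:
        for i, v in shap_of_term(S, c, x, baseline).items():
            phi[i] += v
    return phi


# The Fourier fit is done once per model; every later explanation only touches
# the handful of recovered terms, which is where the amortization comes from.
model = lambda x: 1.0 + 2.0 * parity((0, 1), x) - 0.5 * parity((2,), x)
terms = fit_sparse_fourier(model, n=5)                       # stage 1, run once
baseline = np.zeros(5, dtype=int)
print(amortized_shap(terms, np.array([1, 1, 0, 1, 0]), baseline, n=5))
```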
Related papers
- Iterative Methods for Full-Scale Gaussian Process Approximations for Large Spatial Data [9.913418444556486]
We show how iterative methods can be used to reduce the computational costs for calculating likelihoods, gradients, and predictive distributions with full-scale approximations (FSAs).
We also present a novel, accurate, and fast way to calculate predictive variances relying on estimations and iterative methods.
All methods are implemented in a free C++ software library with high-level Python and R packages.
arXiv Detail & Related papers (2024-05-23T12:25:22Z)
- Self-concordant Smoothing for Large-Scale Convex Composite Optimization [0.0]
We introduce a notion of self-concordant smoothing for minimizing the sum of two convex functions, one of which is smooth and the other may be nonsmooth.
We prove the convergence of two resulting algorithms: Prox-N-SCORE, a proximal Newton algorithm and Prox-GGN-SCORE, a proximal generalized Gauss-Newton algorithm.
arXiv Detail & Related papers (2023-09-04T19:47:04Z)
- Learning Unnormalized Statistical Models via Compositional Optimization [73.30514599338407]
Noise-contrastive estimation (NCE) has been proposed by formulating the objective as the logistic loss of the real data and the artificial noise.
In this paper, we study a direct approach for optimizing the negative log-likelihood of unnormalized models.
arXiv Detail & Related papers (2023-06-13T01:18:16Z)
- Provable benefits of score matching [30.317535687908755]
We give the first example of a natural exponential family of distributions such that the score matching loss is computationally efficient to optimize.
We show that designing a zeroth-order or first-order oracle for optimizing the likelihood loss is NP-hard.
Minimizing the score matching loss is both computationally and statistically efficient, with complexity polynomial in the ambient dimension.
arXiv Detail & Related papers (2023-06-03T03:42:30Z)
- IBIA: An Incremental Build-Infer-Approximate Framework for Approximate Inference of Partition Function [0.0]
Exact computation of the partition function is known to be intractable.
We propose a novel incremental build-infer-approximate framework for approximate inference.
We show that the framework can be used to efficiently compute the partition function.
arXiv Detail & Related papers (2023-04-13T09:40:23Z)
- Efficient Graph Field Integrators Meet Point Clouds [59.27295475120132]
We present two new classes of algorithms for efficient field integration on graphs encoding point clouds.
The first class, SeparatorFactorization (SF), leverages the bounded genus of point cloud mesh graphs, while the second class, RFDiffusion (RFD), uses popular epsilon-nearest-neighbor graph representations for point clouds.
arXiv Detail & Related papers (2023-02-02T08:33:36Z)
- A Stochastic Bundle Method for Interpolating Networks [18.313879914379008]
We propose a novel method for training deep neural networks that are capable of driving the empirical loss to zero.
At each iteration our method constructs a maximum of linear approximations of the learning objective, known as the bundle.
arXiv Detail & Related papers (2022-01-29T23:02:30Z)
- Efficient semidefinite-programming-based inference for binary and multi-class MRFs [83.09715052229782]
We propose an efficient method for computing the partition function or MAP estimate in a pairwise MRF.
We extend semidefinite relaxations from the typical binary MRF to the full multi-class setting, and develop a compact semidefinite relaxation that can again be solved efficiently using the same solver.
arXiv Detail & Related papers (2020-12-04T15:36:29Z)
- Single-Timescale Stochastic Nonconvex-Concave Optimization for Smooth Nonlinear TD Learning [145.54544979467872]
We propose two single-timescale single-loop algorithms that require only one data point each step.
Our results are expressed in the form of simultaneous primal and dual side convergence.
arXiv Detail & Related papers (2020-08-23T20:36:49Z)
- Lagrangian Decomposition for Neural Network Verification [148.0448557991349]
A fundamental component of neural network verification is the computation of bounds on the values their outputs can take.
We propose a novel approach based on Lagrangian Decomposition.
We show that we obtain bounds comparable with off-the-shelf solvers in a fraction of their running time.
arXiv Detail & Related papers (2020-02-24T17:55:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.