Related papers: Energy-based Model for Accurate Shapley Value Estimation in Interpretable Deep Learning Predictive Modeling

URL: http://arxiv.org/abs/2404.01078v2
Date: Sun, 5 May 2024 05:28:56 GMT
Title: Energy-based Model for Accurate Shapley Value Estimation in Interpretable Deep Learning Predictive Modeling
Authors: Cheng Lu, Jiusun Zeng, Yu Xia, Jinhui Cai, Shihua Luo
Abstract summary: EmSHAP is an energy-based model for Shapley value estimation. It estimates the expectation of the Shapley contribution function under an arbitrary subset of features.
Score: 7.378438977893025
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: As a favorable tool for explainable artificial intelligence (XAI), the Shapley value has been widely used to interpret deep learning based predictive models. However, accurate and efficient estimation of the Shapley value is difficult since the computational load grows exponentially with the number of input features. Most existing accelerated estimation methods have to trade estimation accuracy for efficiency. In this article, we present EmSHAP (Energy-based model for Shapley value estimation) to estimate the expectation of the Shapley contribution function under an arbitrary subset of features given the rest. The energy-based model estimates the conditional density in the Shapley contribution function; it involves an energy network for approximating the unnormalized conditional density and a GRU (Gated Recurrent Unit) network for approximating the partition function. The GRU network maps the input features onto a hidden space to eliminate the impact of input orderings. To theoretically evaluate the performance of different Shapley value estimation methods, Theorems 1, 2 and 3 analyze the error bounds of EmSHAP as well as two state-of-the-art methods, namely KernelSHAP and VAEAC. It is proved that EmSHAP has a tighter error bound than KernelSHAP and VAEAC. Finally, case studies on two application examples show the enhanced estimation accuracy of EmSHAP.
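
To illustrate why the abstract describes the computational load as exponential, the following is a minimal sketch of exact Shapley value computation by enumerating every feature coalition. It is not the EmSHAP method. The names `exact_shapley` and `toy_value` are hypothetical, and `toy_value` is a made-up value function standing in for the conditional expectation of a model's output given the features in a coalition, which is the quantity EmSHAP is said to estimate.

```python
from itertools import combinations
from math import factorial


def exact_shapley(value_fn, n_features):
    """Exact Shapley values by enumerating all coalitions.

    value_fn takes a frozenset of feature indices and returns a float.
    The loop visits 2**(n_features - 1) coalitions per feature, which is
    the exponential cost that approximate estimators try to avoid.
    """
    players = list(range(n_features))
    phi = [0.0] * n_features
    for i in players:
        others = [j for j in players if j != i]
        for size in range(len(others) + 1):
            # Shapley weight: |S|! * (d - |S| - 1)! / d!
            weight = (
                factorial(size)
                * factorial(n_features - size - 1)
                / factorial(n_features)
            )
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of feature i to coalition s.
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi


def toy_value(coalition):
    # Hypothetical value function (an assumption for illustration only):
    # additive payoffs plus an interaction between features 0 and 1.
    payoff = {0: 1.0, 1: 2.0, 2: 3.0}
    total = sum(payoff[j] for j in coalition)
    if 0 in coalition and 1 in coalition:
        total += 4.0
    return total


phi = exact_shapley(toy_value, 3)
print(phi)
```

In this toy game the interaction payoff is split evenly between features 0 and 1, and the values sum to the payoff of the full coalition (the efficiency property). With d features the enumeration needs on the order of 2^d value-function evaluations, which is why sampling-based or learned estimators are used in practice.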

Related papers

Taming Polysemanticity in LLMs: Provable Feature Recovery via Sparse Autoencoders [50.52694757593443]
Existing SAE training algorithms often lack rigorous mathematical guarantees and suffer from practical limitations. We first propose a novel statistical framework for the feature recovery problem, which includes a new notion of feature identifiability. We introduce a new SAE training algorithm based on "bias adaptation", a technique that adaptively adjusts neural network bias parameters to ensure appropriate activation sparsity.
arXiv Detail & Related papers (2025-06-16T20:58:05Z)
Unveil Sources of Uncertainty: Feature Contribution to Conformal Prediction Intervals [0.3495246564946556]
We propose a novel, model-agnostic uncertainty attribution (UA) method grounded in conformal prediction (CP). By defining cooperative games where CP interval properties, such as width and bounds, serve as value functions, we attribute predictive uncertainty to input features. Our experiments on synthetic benchmarks and real-world datasets demonstrate the practical utility and interpretative depth of our approach.
arXiv Detail & Related papers (2025-05-19T13:49:05Z)
Improving the Sampling Strategy in KernelSHAP [0.8057006406834466]
The KernelSHAP framework enables us to approximate the Shapley values using a sampled subset of weighted conditional expectations. We propose three main novel contributions: a stabilizing technique to reduce the variance of the weights in the current state-of-the-art strategy, a novel weighting scheme that corrects the Shapley kernel weights based on sampled subsets, and a straightforward strategy that includes the important subsets and integrates them with the corrected Shapley kernel weights.
arXiv Detail & Related papers (2024-10-07T10:02:31Z)
Enabling Uncertainty Estimation in Iterative Neural Networks [49.56171792062104]
We develop an approach to uncertainty estimation that provides state-of-the-art estimates at a much lower computational cost than techniques like Ensembles. We demonstrate its practical value by embedding it in two application domains: road detection in aerial images and the estimation of aerodynamic properties of 2D and 3D shapes.
arXiv Detail & Related papers (2024-03-25T13:06:31Z)
Variational Shapley Network: A Probabilistic Approach to Self-Explaining Shapley values with Uncertainty Quantification [2.6699011287124366]
Shapley values have emerged as a foundational tool in machine learning (ML) for elucidating model decision-making processes. We introduce a novel, self-explaining method that simplifies the computation of Shapley values significantly, requiring only a single forward pass.
arXiv Detail & Related papers (2024-02-06T18:09:05Z)
Fast Shapley Value Estimation: A Unified Approach [71.92014859992263]
We propose a straightforward and efficient Shapley estimator, SimSHAP, by eliminating redundant techniques. In our analysis of existing approaches, we observe that estimators can be unified as a linear transformation of randomly summed values from feature subsets. Our experiments validate the effectiveness of our SimSHAP, which significantly accelerates the computation of accurate Shapley values.
arXiv Detail & Related papers (2023-11-02T06:09:24Z)
Efficient Model-Free Exploration in Low-Rank MDPs [76.87340323826945]
Low-Rank Markov Decision Processes offer a simple, yet expressive framework for RL with function approximation. Existing algorithms are either (1) computationally intractable, or (2) reliant upon restrictive statistical assumptions. We propose the first provably sample-efficient algorithm for exploration in Low-Rank MDPs.
arXiv Detail & Related papers (2023-07-08T15:41:48Z)
Efficient Shapley Values Estimation by Amortization for Text Classification [66.7725354593271]
We develop an amortized model that directly predicts each input feature's Shapley Value without additional model evaluations. Experimental results on two text classification datasets demonstrate that our amortized model estimates Shapley Values accurately with up to 60 times speedup.
arXiv Detail & Related papers (2023-05-31T16:19:13Z)
A $k$-additive Choquet integral-based approach to approximate the SHAP values for local interpretability in machine learning [8.637110868126546]
This paper aims to provide some interpretability for machine learning models based on Shapley values. A SHAP-based method called Kernel SHAP adopts an efficient strategy that approximates such values with less computational effort. The obtained results attest that our proposal needs fewer computations on coalitions of attributes to approximate the SHAP values.
arXiv Detail & Related papers (2022-11-03T22:34:50Z)
Numerically Stable Sparse Gaussian Processes via Minimum Separation using Cover Trees [57.67528738886731]
We study the numerical stability of scalable sparse approximations based on inducing points. For low-dimensional tasks such as geospatial modeling, we propose an automated method for computing inducing points satisfying these conditions.
arXiv Detail & Related papers (2022-10-14T15:20:17Z)
Adaptive LASSO estimation for functional hidden dynamic geostatistical model [69.10717733870575]
We propose a novel model selection algorithm based on a penalized maximum likelihood estimator (PMLE) for functional hidden dynamic geostatistical models (f-HD). The algorithm is based on iterative optimisation and uses an adaptive least absolute shrinkage and selector operator (LASSO) penalty function, wherein the weights are obtained by the unpenalised f-HD maximum-likelihood estimators.
arXiv Detail & Related papers (2022-08-10T19:17:45Z)
Shapley Computations Using Surrogate Model-Based Trees [4.2575268077562685]
This paper proposes the use of a surrogate model-based tree to compute Shapley and SHAP values based on conditional expectation. Simulation studies show that the proposed algorithm provides improvements in accuracy and unifies global Shapley and SHAP interpretation, and that the thresholding method provides a way to trade off running time and accuracy.
arXiv Detail & Related papers (2022-07-11T22:20:51Z)
Accelerating Shapley Explanation via Contributive Cooperator Selection [42.11059072201565]
We propose a novel method, SHEAR, to significantly accelerate the Shapley explanation for DNN models. The selection of the feature coalitions follows our proposed Shapley chain rule to minimize the absolute error from the ground-truth Shapley values. SHEAR consistently outperforms state-of-the-art baseline methods across different evaluation metrics.
arXiv Detail & Related papers (2022-06-17T03:24:45Z)
Counterfactual Shapley Additive Explanations [6.916452769334367]
We propose a variant of SHAP, CoSHAP, that uses counterfactual generation techniques to produce a background dataset. We motivate the need within the actionable recourse setting for careful consideration of background datasets when using Shapley values for feature attributions.
arXiv Detail & Related papers (2021-10-27T08:44:53Z)
Robust Implicit Networks via Non-Euclidean Contractions [63.91638306025768]
Implicit neural networks show improved accuracy and a significant reduction in memory consumption. However, they can suffer from ill-posedness and convergence instability. This paper provides a new framework to design well-posed and robust implicit neural networks.
arXiv Detail & Related papers (2021-06-06T18:05:02Z)
Probabilistic electric load forecasting through Bayesian Mixture Density Networks [70.50488907591463]
Probabilistic load forecasting (PLF) is a key component in the extended tool-chain required for efficient management of smart energy grids. We propose a novel PLF approach, framed on Bayesian Mixture Density Networks. To achieve reliable and computationally scalable estimators of the posterior distributions, both Mean Field variational inference and deep ensembles are integrated.
arXiv Detail & Related papers (2020-12-23T16:21:34Z)
Exploiting Submodular Value Functions For Scaling Up Active Perception [60.81276437097671]
In active perception tasks, an agent aims to select sensory actions that reduce uncertainty about one or more hidden variables. Partially observable Markov decision processes (POMDPs) provide a natural model for such problems. As the number of sensors available to the agent grows, the computational cost of POMDP planning grows exponentially.
arXiv Detail & Related papers (2020-09-21T09:11:36Z)
Approximation Algorithms for Sparse Principal Component Analysis [57.5357874512594]
Principal component analysis (PCA) is a widely used dimension reduction technique in machine learning and statistics. Various approaches to obtain sparse principal direction loadings have been proposed, which are termed Sparse Principal Component Analysis (SPCA). We present thresholding as a provably accurate, polynomial-time approximation algorithm for the SPCA problem.
arXiv Detail & Related papers (2020-06-23T04:25:36Z)

This list is automatically generated from the titles and abstracts of the papers on this site.