Related papers: Algorithms to estimate Shapley value feature attributions

Algorithms to estimate Shapley value feature attributions

URL: http://arxiv.org/abs/2207.07605v1
Date: Fri, 15 Jul 2022 17:04:41 GMT
Title: Algorithms to estimate Shapley value feature attributions
Authors: Hugh Chen and Ian C. Covert and Scott M. Lundberg and Su-In Lee
Abstract summary: Feature attributions based on the Shapley value are popular for explaining machine learning models. We disentangle this complexity into two factors: (1)the approach to removing feature information, and (2)the tractable estimation strategy.
Score: 11.527421282223948
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Feature attributions based on the Shapley value are popular for explaining machine learning models; however, their estimation is complex from both a theoretical and computational standpoint. We disentangle this complexity into two factors: (1)~the approach to removing feature information, and (2)~the tractable estimation strategy. These two factors provide a natural lens through which we can better understand and compare 24 distinct algorithms. Based on the various feature removal approaches, we describe the multiple types of Shapley value feature attributions and methods to calculate each one. Then, based on the tractable estimation strategies, we characterize two distinct families of approaches: model-agnostic and model-specific approximations. For the model-agnostic approximations, we benchmark a wide class of estimation approaches and tie them to alternative yet equivalent characterizations of the Shapley value. For the model-specific approximations, we clarify the assumptions crucial to each method's tractability for linear, tree, and deep models. Finally, we identify gaps in the literature and promising future research directions.

Related papers

Analysis of Off-Policy $n$-Step TD-Learning with Linear Function Approximation [6.663174194579773]
This paper analyzes multi-step temporal difference (TD)-learning algorithms within the deadly triad'' scenario. In particular, we prove that $n$-step TD-learning algorithms converge to a solution as the sampling horizon $n$ increases sufficiently. Two $n$-step TD-learning algorithms are proposed and analyzed, which can be seen as the model-free reinforcement learning counterparts of the model-based deterministic algorithms.
arXiv Detail & Related papers (2025-02-13T03:43:13Z)
Latent Semantic Consensus For Deterministic Geometric Model Fitting [109.44565542031384]
We propose an effective method called Latent Semantic Consensus (LSC) LSC formulates the model fitting problem into two latent semantic spaces based on data points and model hypotheses. LSC is able to provide consistent and reliable solutions within only a few milliseconds for general multi-structural model fitting.
arXiv Detail & Related papers (2024-03-11T05:35:38Z)
Sample Complexity Characterization for Linear Contextual MDPs [67.79455646673762]
Contextual decision processes (CMDPs) describe a class of reinforcement learning problems in which the transition kernels and reward functions can change over time with different MDPs indexed by a context variable. CMDPs serve as an important framework to model many real-world applications with time-varying environments. We study CMDPs under two linear function approximation models: Model I with context-varying representations and common linear weights for all contexts; and Model II with common representations for all contexts and context-varying linear weights.
arXiv Detail & Related papers (2024-02-05T03:25:04Z)
Fast Shapley Value Estimation: A Unified Approach [71.92014859992263]
We propose a straightforward and efficient Shapley estimator, SimSHAP, by eliminating redundant techniques. In our analysis of existing approaches, we observe that estimators can be unified as a linear transformation of randomly summed values from feature subsets. Our experiments validate the effectiveness of our SimSHAP, which significantly accelerates the computation of accurate Shapley values.
arXiv Detail & Related papers (2023-11-02T06:09:24Z)
Efficient Shapley Values Estimation by Amortization for Text Classification [66.7725354593271]
We develop an amortized model that directly predicts each input feature's Shapley Value without additional model evaluations. Experimental results on two text classification datasets demonstrate that our amortized model estimates Shapley Values accurately with up to 60 times speedup.
arXiv Detail & Related papers (2023-05-31T16:19:13Z)
SHAP-IQ: Unified Approximation of any-order Shapley Interactions [6.101024067998782]
Shapley value (SV) is applied to determine feature attributions for any black box model. ShaPley Interaction Quantification (SHAP-IQ) is an efficient sampling-based approximator to compute Shapley interactions.
arXiv Detail & Related papers (2023-03-02T11:49:05Z)
Multivariate Systemic Risk Measures and Computation by Deep Learning Algorithms [63.03966552670014]
We discuss the key related theoretical aspects, with a particular focus on the fairness properties of primal optima and associated risk allocations. The algorithms we provide allow for learning primals, optima for the dual representation and corresponding fair risk allocations.
arXiv Detail & Related papers (2023-02-02T22:16:49Z)
Exact Shapley Values for Local and Model-True Explanations of Decision Tree Ensembles [0.0]
We consider the application of Shapley values for explaining decision tree ensembles. We present a novel approach to Shapley value-based feature attribution that can be applied to random forests and boosted decision trees.
arXiv Detail & Related papers (2021-12-16T20:16:02Z)
Accurate Shapley Values for explaining tree-based models [0.0]
We introduce two estimators of Shapley Values that exploit the tree structure efficiently and are more accurate than state-of-the-art methods. These methods are available as a Python package.
arXiv Detail & Related papers (2021-06-07T17:35:54Z)
Explaining predictive models using Shapley values and non-parametric vine copulas [2.6774008509840996]
We propose two new approaches for modelling the dependence between the features. The performance of the proposed methods is evaluated on simulated data sets and a real data set. Experiments demonstrate that the vine copula approaches give more accurate approximations to the true Shapley values than its competitors.
arXiv Detail & Related papers (2021-02-12T09:43:28Z)
A Multilinear Sampling Algorithm to Estimate Shapley Values [4.771833920251869]
We propose a new sampling method based on a multilinear extension technique as applied in game theory. Our method is applicable to any machine learning model, in particular for either multi-class classifications or regression problems.
arXiv Detail & Related papers (2020-10-22T21:47:16Z)
CONSAC: Robust Multi-Model Fitting by Conditional Sample Consensus [62.86856923633923]
We present a robust estimator for fitting multiple parametric models of the same form to noisy measurements. In contrast to previous works, which resorted to hand-crafted search strategies for multiple model detection, we learn the search strategy from data. For self-supervised learning of the search, we evaluate the proposed algorithm on multi-homography estimation and demonstrate an accuracy that is superior to state-of-the-art methods.
arXiv Detail & Related papers (2020-01-08T17:37:01Z)

This list is automatically generated from the titles and abstracts of the papers in this site.