Algorithms to estimate Shapley value feature attributions
- URL: http://arxiv.org/abs/2207.07605v1
- Date: Fri, 15 Jul 2022 17:04:41 GMT
- Title: Algorithms to estimate Shapley value feature attributions
- Authors: Hugh Chen and Ian C. Covert and Scott M. Lundberg and Su-In Lee
- Abstract summary: Feature attributions based on the Shapley value are popular for explaining machine learning models.
We disentangle this complexity into two factors: (1) the approach to removing feature information, and (2) the tractable estimation strategy.
- Score: 11.527421282223948
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Feature attributions based on the Shapley value are popular for explaining
machine learning models; however, their estimation is complex from both a
theoretical and computational standpoint. We disentangle this complexity into
two factors: (1) the approach to removing feature information, and (2) the
tractable estimation strategy. These two factors provide a natural lens through
which we can better understand and compare 24 distinct algorithms. Based on the
various feature removal approaches, we describe the multiple types of Shapley
value feature attributions and methods to calculate each one. Then, based on
the tractable estimation strategies, we characterize two distinct families of
approaches: model-agnostic and model-specific approximations. For the
model-agnostic approximations, we benchmark a wide class of estimation
approaches and tie them to alternative yet equivalent characterizations of the
Shapley value. For the model-specific approximations, we clarify the
assumptions crucial to each method's tractability for linear, tree, and deep
models. Finally, we identify gaps in the literature and promising future
research directions.
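The two factors named in the abstract can be illustrated with a small sketch. It assumes one particular removal approach (replacing absent features with a fixed baseline value) and one particular model-agnostic estimation strategy (averaging marginal contributions over random feature orderings), and compares the estimate against exact enumeration on a toy model. All names here (`value`, `exact_shapley`, `permutation_shapley`, `toy_model`) are hypothetical and chosen for illustration; this is not the authors' code.

```python
import itertools
import math
import random


def value(model, x, baseline, subset):
    """Model output when features outside `subset` are replaced by baseline values.

    This is the assumed feature-removal approach (a fixed baseline).
    """
    masked = [x[i] if i in subset else baseline[i] for i in range(len(x))]
    return model(masked)


def exact_shapley(model, x, baseline):
    """Exact Shapley values by enumerating all subsets (exponential in len(x))."""
    d = len(x)
    phi = [0.0] * d
    for i in range(d):
        others = [j for j in range(d) if j != i]
        for size in range(d):
            # Shapley weight for a subset of this size that excludes feature i.
            weight = (math.factorial(size) * math.factorial(d - size - 1)
                      / math.factorial(d))
            for subset in itertools.combinations(others, size):
                s = set(subset)
                gain = (value(model, x, baseline, s | {i})
                        - value(model, x, baseline, s))
                phi[i] += weight * gain
    return phi


def permutation_shapley(model, x, baseline, num_permutations=2000, seed=0):
    """Monte Carlo estimate: average marginal contributions over random orderings."""
    rng = random.Random(seed)
    d = len(x)
    phi = [0.0] * d
    for _ in range(num_permutations):
        order = list(range(d))
        rng.shuffle(order)
        present = set()
        prev = value(model, x, baseline, present)
        for i in order:
            present.add(i)
            cur = value(model, x, baseline, present)
            phi[i] += cur - prev  # marginal contribution of i in this ordering
            prev = cur
    return [p / num_permutations for p in phi]


def toy_model(z):
    # Hypothetical model: one additive term and one pairwise interaction.
    return 2.0 * z[0] + z[1] * z[2]


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
exact = exact_shapley(toy_model, x, baseline)
approx = permutation_shapley(toy_model, x, baseline)
print("exact: ", exact)
print("approx:", approx)
```

In this toy setting the interaction term is split evenly between the two features involved, and both sets of attributions sum to the difference between the model output at `x` and at the baseline. Swapping `value` for a different removal approach changes which Shapley value is being estimated, while swapping `permutation_shapley` changes only how it is approximated.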
Related papers
- Latent Semantic Consensus For Deterministic Geometric Model Fitting [109.44565542031384]
We propose an effective method called Latent Semantic Consensus (LSC).
LSC formulates the model fitting problem into two latent semantic spaces based on data points and model hypotheses.
LSC is able to provide consistent and reliable solutions within only a few milliseconds for general multi-structural model fitting.
arXiv Detail & Related papers (2024-03-11T05:35:38Z)
- Sample Complexity Characterization for Linear Contextual MDPs [67.79455646673762]
Contextual Markov decision processes (CMDPs) describe a class of reinforcement learning problems in which the transition kernels and reward functions can change over time with different MDPs indexed by a context variable.
CMDPs serve as an important framework to model many real-world applications with time-varying environments.
We study CMDPs under two linear function approximation models: Model I with context-varying representations and common linear weights for all contexts; and Model II with common representations for all contexts and context-varying linear weights.
arXiv Detail & Related papers (2024-02-05T03:25:04Z)
- Fast Shapley Value Estimation: A Unified Approach [71.92014859992263]
We propose a straightforward and efficient Shapley estimator, SimSHAP, by eliminating redundant techniques.
In our analysis of existing approaches, we observe that estimators can be unified as a linear transformation of randomly summed values from feature subsets.
Our experiments validate the effectiveness of our SimSHAP, which significantly accelerates the computation of accurate Shapley values.
arXiv Detail & Related papers (2023-11-02T06:09:24Z)
- Efficient Shapley Values Estimation by Amortization for Text Classification [66.7725354593271]
We develop an amortized model that directly predicts each input feature's Shapley Value without additional model evaluations.
Experimental results on two text classification datasets demonstrate that our amortized model estimates Shapley Values accurately with up to 60 times speedup.
arXiv Detail & Related papers (2023-05-31T16:19:13Z)
- SHAP-IQ: Unified Approximation of any-order Shapley Interactions [6.101024067998782]
Shapley value (SV) is applied to determine feature attributions for any black box model.
ShaPley Interaction Quantification (SHAP-IQ) is an efficient sampling-based approximator to compute Shapley interactions.
arXiv Detail & Related papers (2023-03-02T11:49:05Z)
- Multivariate Systemic Risk Measures and Computation by Deep Learning Algorithms [63.03966552670014]
We discuss the key related theoretical aspects, with a particular focus on the fairness properties of primal optima and associated risk allocations.
The algorithms we provide allow for learning primals, optima for the dual representation and corresponding fair risk allocations.
arXiv Detail & Related papers (2023-02-02T22:16:49Z)
- Exact Shapley Values for Local and Model-True Explanations of Decision Tree Ensembles [0.0]
We consider the application of Shapley values for explaining decision tree ensembles.
We present a novel approach to Shapley value-based feature attribution that can be applied to random forests and boosted decision trees.
arXiv Detail & Related papers (2021-12-16T20:16:02Z)
- Accurate Shapley Values for explaining tree-based models [0.0]
We introduce two estimators of Shapley Values that exploit the tree structure efficiently and are more accurate than state-of-the-art methods.
These methods are available as a Python package.
arXiv Detail & Related papers (2021-06-07T17:35:54Z)
- Explaining predictive models using Shapley values and non-parametric vine copulas [2.6774008509840996]
We propose two new approaches for modelling the dependence between the features.
The performance of the proposed methods is evaluated on simulated data sets and a real data set.
Experiments demonstrate that the vine copula approaches give more accurate approximations to the true Shapley values than their competitors.
arXiv Detail & Related papers (2021-02-12T09:43:28Z)
- A Multilinear Sampling Algorithm to Estimate Shapley Values [4.771833920251869]
We propose a new sampling method based on a multilinear extension technique as applied in game theory.
Our method is applicable to any machine learning model, in particular for either multi-class classifications or regression problems.
arXiv Detail & Related papers (2020-10-22T21:47:16Z)
- CONSAC: Robust Multi-Model Fitting by Conditional Sample Consensus [62.86856923633923]
We present a robust estimator for fitting multiple parametric models of the same form to noisy measurements.
In contrast to previous works, which resorted to hand-crafted search strategies for multiple model detection, we learn the search strategy from data.
For self-supervised learning of the search, we evaluate the proposed algorithm on multi-homography estimation and demonstrate an accuracy that is superior to state-of-the-art methods.
arXiv Detail & Related papers (2020-01-08T17:37:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.