Exact Shapley Values for Local and Model-True Explanations of Decision
Tree Ensembles
- URL: http://arxiv.org/abs/2112.10592v1
- Date: Thu, 16 Dec 2021 20:16:02 GMT
- Title: Exact Shapley Values for Local and Model-True Explanations of Decision
Tree Ensembles
- Authors: Thomas W. Campbell, Heinrich Roder, Robert W. Georgantas III, Joanna
Roder
- Abstract summary: We consider the application of Shapley values for explaining decision tree ensembles.
We present a novel approach to Shapley value-based feature attribution that can be applied to random forests and boosted decision trees.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Additive feature explanations using Shapley values have become popular for
providing transparency into the relative importance of each feature to an
individual prediction of a machine learning model. While Shapley values provide
a unique additive feature attribution in cooperative game theory, the Shapley
values that can be generated for even a single machine learning model are far
from unique, with theoretical and implementational decisions affecting the
resulting attributions. Here, we consider the application of Shapley values for
explaining decision tree ensembles and present a novel approach to Shapley
value-based feature attribution that can be applied to random forests and
boosted decision trees. This new method provides attributions that accurately
reflect details of the model prediction algorithm for individual instances,
while being computationally competitive with one of the most widely used
current methods. We explain the theoretical differences between the standard
and novel approaches and compare their performance using synthetic and real
data.
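To make concrete what an "exact" and "additive" attribution means, the following minimal sketch computes Shapley values directly from the cooperative-game definition by enumerating every feature subset. It is not the method proposed in the paper, nor the tree-specific algorithm it is compared against. The names `exact_shapley` and `toy_model`, and the choice to replace absent features with a fixed baseline value, are assumptions made only for illustration; that baseline choice is one of the "theoretical and implementational decisions" the abstract says can change the resulting attributions.

```python
from itertools import combinations
from math import factorial


def exact_shapley(model, x, baseline):
    """Brute-force Shapley values for one instance x.

    Assumption (illustrative only): a feature that is "absent" from a
    coalition takes its value from `baseline`. Cost grows as 2^n in the
    number of features n, so this is only usable for tiny n.
    """
    n = len(x)

    def value(subset):
        # Evaluate the model with features in `subset` taken from x
        # and all other features taken from the baseline.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude feature i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for coalition in combinations(others, k):
                s = set(coalition)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(z):
    # Hypothetical model: one additive term plus one interaction term.
    return 2.0 * z[0] + z[1] * z[2]


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = exact_shapley(toy_model, x, baseline)

# Additivity: attributions sum to f(x) - f(baseline).
total = sum(phi)
gap = toy_model(x) - toy_model(baseline)
```

In this toy case the additive term is credited entirely to the first feature, and the interaction term is split equally between the other two, so the attributions sum to the difference between the prediction and the baseline prediction. Choosing a different baseline, or averaging over a background dataset instead, would generally give different values for the same model.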
Related papers
- Shapley Pruning for Neural Network Compression [63.60286036508473]
This work presents Shapley value approximations and performs a comparative analysis of their cost-benefit utility for neural network compression.
The proposed normative ranking and its approximations show practical results, obtaining state-of-the-art network compression.
arXiv Detail & Related papers (2024-07-19T11:42:54Z)
- Variational Shapley Network: A Probabilistic Approach to Self-Explaining Shapley values with Uncertainty Quantification [2.6699011287124366]
Shapley values have emerged as a foundational tool in machine learning (ML) for elucidating model decision-making processes.
We introduce a novel, self-explaining method that simplifies the computation of Shapley values significantly, requiring only a single forward pass.
arXiv Detail & Related papers (2024-02-06T18:09:05Z)
- Fast Shapley Value Estimation: A Unified Approach [71.92014859992263]
We propose a straightforward and efficient Shapley estimator, SimSHAP, by eliminating redundant techniques.
In our analysis of existing approaches, we observe that estimators can be unified as a linear transformation of randomly summed values from feature subsets.
Our experiments validate the effectiveness of our SimSHAP, which significantly accelerates the computation of accurate Shapley values.
arXiv Detail & Related papers (2023-11-02T06:09:24Z)
- Efficient Shapley Values Estimation by Amortization for Text Classification [66.7725354593271]
We develop an amortized model that directly predicts each input feature's Shapley Value without additional model evaluations.
Experimental results on two text classification datasets demonstrate that our amortized model estimates Shapley Values accurately with up to 60 times speedup.
arXiv Detail & Related papers (2023-05-31T16:19:13Z)
- Grouping Shapley Value Feature Importances of Random Forests for explainable Yield Prediction [0.8543936047647136]
We explain the concept of Shapley values directly computed for groups of features and introduce an algorithm to compute them efficiently on tree structures.
We provide a blueprint for designing swarm plots that combine many local explanations for global understanding.
arXiv Detail & Related papers (2023-04-14T13:03:33Z)
- HyperImpute: Generalized Iterative Imputation with Automatic Model Selection [77.86861638371926]
We propose a generalized iterative imputation framework for adaptively and automatically configuring column-wise models.
We provide a concrete implementation with out-of-the-box learners, simulators, and interfaces.
arXiv Detail & Related papers (2022-06-15T19:10:35Z)
- MACE: An Efficient Model-Agnostic Framework for Counterfactual Explanation [132.77005365032468]
We propose a novel framework of Model-Agnostic Counterfactual Explanation (MACE).
In our MACE approach, we propose a novel RL-based method for finding good counterfactual examples and a gradient-less descent method for improving proximity.
Experiments on public datasets validate the effectiveness with better validity, sparsity and proximity.
arXiv Detail & Related papers (2022-05-31T04:57:06Z)
- groupShapley: Efficient prediction explanation with Shapley values for feature groups [2.320417845168326]
Shapley values have become established as one of the most appropriate and theoretically sound frameworks for explaining predictions from machine learning models.
The main drawback of Shapley values is that their computational complexity grows exponentially in the number of input features.
The present paper introduces groupShapley: a conceptually simple approach for dealing with the aforementioned bottlenecks.
arXiv Detail & Related papers (2021-06-23T08:16:14Z)
- Explaining predictive models using Shapley values and non-parametric vine copulas [2.6774008509840996]
We propose two new approaches for modelling the dependence between the features.
The performance of the proposed methods is evaluated on simulated data sets and a real data set.
Experiments demonstrate that the vine copula approaches give more accurate approximations to the true Shapley values than their competitors.
arXiv Detail & Related papers (2021-02-12T09:43:28Z)
- Causal Shapley Values: Exploiting Causal Knowledge to Explain Individual Predictions of Complex Models [6.423239719448169]
Shapley values are designed to attribute the difference between a model's prediction and an average baseline to the different features used as input to the model.
We show how these 'causal' Shapley values can be derived for general causal graphs without sacrificing any of their desirable properties.
arXiv Detail & Related papers (2020-11-03T11:11:36Z)
- Control as Hybrid Inference [62.997667081978825]
We present an implementation of CHI which naturally mediates the balance between iterative and amortised inference.
We verify the scalability of our algorithm on a continuous control benchmark, demonstrating that it outperforms strong model-free and model-based baselines.
arXiv Detail & Related papers (2020-07-11T19:44:09Z)