Beyond TreeSHAP: Efficient Computation of Any-Order Shapley Interactions
for Tree Ensembles
- URL: http://arxiv.org/abs/2401.12069v1
- Date: Mon, 22 Jan 2024 16:08:41 GMT
- Title: Beyond TreeSHAP: Efficient Computation of Any-Order Shapley Interactions
for Tree Ensembles
- Authors: Maximilian Muschalik, Fabian Fumagalli, Barbara Hammer, Eyke Hüllermeier
- Abstract summary: The Shapley value (SV) is a concept in explainable artificial intelligence (XAI) research for quantifying additive feature attributions of predictions.
We present TreeSHAP-IQ, an efficient method to compute any-order additive Shapley interactions for predictions of tree-based models.
- Score: 6.664930499708017
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While shallow decision trees may be interpretable, larger ensemble models
like gradient-boosted trees, which often set the state of the art in machine
learning problems involving tabular data, still remain black box models. As a
remedy, the Shapley value (SV) is a well-known concept in explainable
artificial intelligence (XAI) research for quantifying additive feature
attributions of predictions. The model-specific TreeSHAP methodology solves the
exponential complexity for retrieving exact SVs from tree-based models.
Expanding beyond individual feature attribution, Shapley interactions reveal
the impact of intricate feature interactions of any order. In this work, we
present TreeSHAP-IQ, an efficient method to compute any-order additive Shapley
interactions for predictions of tree-based models. TreeSHAP-IQ is supported by
a mathematical framework that exploits polynomial arithmetic to compute the
interaction scores in a single recursive traversal of the tree, akin to Linear
TreeSHAP. We apply TreeSHAP-IQ on state-of-the-art tree ensembles and explore
interactions on well-established benchmark datasets.
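The quantities the abstract describes can be made concrete with a brute-force sketch: below, exact Shapley values and the pairwise Shapley interaction index are computed by enumerating all feature coalitions, which is the exponential-cost baseline that TreeSHAP and TreeSHAP-IQ avoid for tree models. The two-feature AND model and the all-zeros baseline are hypothetical stand-ins for a tree ensemble and its background distribution, not the paper's method.

```python
from itertools import combinations
from math import factorial

def model(x):
    # Toy stand-in for a tree ensemble: a pure AND interaction of two binary features.
    return 1.0 if x[0] == 1 and x[1] == 1 else 0.0

BASELINE = (0, 0)  # reference point used to "remove" absent features

def value(coalition, x):
    """Model output when only features in `coalition` keep their values from x."""
    z = tuple(x[i] if i in coalition else BASELINE[i] for i in range(len(x)))
    return model(z)

def shapley_value(i, x):
    """Exact SV of feature i: weighted marginal contributions over all coalitions."""
    n = len(x)
    others = [j for j in range(n) if j != i]
    total = 0.0
    for k in range(n):
        for S in combinations(others, k):
            w = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
            total += w * (value(set(S) | {i}, x) - value(set(S), x))
    return total

def pairwise_interaction(i, j, x):
    """Shapley interaction index for {i, j}: weighted discrete mixed differences."""
    n = len(x)
    others = [k for k in range(n) if k not in (i, j)]
    total = 0.0
    for k in range(n - 1):
        for S in combinations(others, k):
            w = factorial(len(S)) * factorial(n - len(S) - 2) / factorial(n - 1)
            delta = (value(set(S) | {i, j}, x) - value(set(S) | {i}, x)
                     - value(set(S) | {j}, x) + value(set(S), x))
            total += w * delta
    return total

x = (1, 1)
print(shapley_value(0, x))            # 0.5: the credit is split between both features
print(pairwise_interaction(0, 1, x))  # 1.0: the entire effect is the pairwise interaction
```

For an AND model, individual SVs split the credit evenly and cannot show that neither feature matters alone; the interaction index recovers this, which is the motivation for computing interactions of any order.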
Related papers
- Terminating Differentiable Tree Experts [77.2443883991608]
We propose a neuro-symbolic Differentiable Tree Machine that learns tree operations using a combination of transformers and Representation Products.
We first remove a series of different transformer layers that are used in every step by introducing a mixture of experts.
We additionally propose a new termination algorithm to provide the model the power to choose how many steps to make automatically.
arXiv Detail & Related papers (2024-07-02T08:45:38Z)
- An efficient solution to Hidden Markov Models on trees with coupled branches [0.0]
We extend the framework of Hidden Markov Models (HMMs) on trees to address scenarios where the tree-like structure of the data includes coupled branches.
We develop a dynamic programming algorithm that efficiently solves the likelihood, decoding, and parameter learning problems for tree-based HMMs with coupled branches.
arXiv Detail & Related papers (2024-06-03T18:00:00Z)
- Forecasting with Hyper-Trees [50.72190208487953]
Hyper-Trees are designed to learn the parameters of time series models.
By relating the parameters of a target time series model to features, Hyper-Trees also address the issue of parameter non-stationarity.
In this novel approach, the trees first generate informative representations from the input features, which a shallow network then maps to the target model parameters.
arXiv Detail & Related papers (2024-05-13T15:22:15Z)
- On marginal feature attributions of tree-based models [0.11184789007828977]
Local feature attributions based on marginal expectations, e.g. marginal Shapley, Owen or Banzhaf values, may be employed.
We present two (statistically similar) decision trees that compute the exact same function for which the "path-dependent" TreeSHAP yields different rankings of features.
We exploit the symmetry to derive an explicit formula, with improved complexity and only in terms of the internal model parameters, for marginal Shapley (and Banzhaf and Owen) values of CatBoost models.
arXiv Detail & Related papers (2023-02-16T17:18:03Z)
- Unboxing Tree Ensembles for interpretability: a hierarchical visualization tool and a multivariate optimal re-built tree [0.34530027457862006]
We develop an interpretable representation of a tree-ensemble model that can provide valuable insights into its behavior.
The proposed model is effective in yielding a shallow interpretable tree approximating the tree-ensemble decision function.
arXiv Detail & Related papers (2023-02-15T10:43:31Z)
- SETAR-Tree: A Novel and Accurate Tree Algorithm for Global Time Series Forecasting [7.206754802573034]
In this paper, we explore the close connections between TAR models and regression trees.
We introduce a new forecasting-specific tree algorithm that trains global Pooled Regression (PR) models in the leaves.
In our evaluation, the proposed tree and forest models are able to achieve significantly higher accuracy than a set of state-of-the-art tree-based algorithms.
arXiv Detail & Related papers (2022-11-16T04:30:42Z)
- RLET: A Reinforcement Learning Based Approach for Explainable QA with Entailment Trees [47.745218107037786]
We propose RLET, a Reinforcement Learning based Entailment Tree generation framework.
RLET iteratively performs single step reasoning with sentence selection and deduction generation modules.
Experiments on three settings of the EntailmentBank dataset demonstrate the strength of the RL framework.
arXiv Detail & Related papers (2022-10-31T06:45:05Z)
- Visualizing hierarchies in scRNA-seq data using a density tree-biased autoencoder [50.591267188664666]
We propose an approach for identifying a meaningful tree structure from high-dimensional scRNA-seq data.
We then introduce DTAE, a tree-biased autoencoder that emphasizes the tree structure of the data in low dimensional space.
arXiv Detail & Related papers (2021-02-11T08:48:48Z)
- Growing Deep Forests Efficiently with Soft Routing and Learned Connectivity [79.83903179393164]
This paper further extends the deep forest idea in several important aspects.
We employ a probabilistic tree whose nodes make probabilistic routing decisions, a.k.a., soft routing, rather than hard binary decisions.
Experiments on the MNIST dataset demonstrate that our empowered deep forests can achieve performance better than or comparable to [1] and [3].
arXiv Detail & Related papers (2020-12-29T18:05:05Z)
- Evaluating Tree Explanation Methods for Anomaly Reasoning: A Case Study of SHAP TreeExplainer and TreeInterpreter [6.718611456024702]
We investigate the performance of two methods for explaining tree-based models: TreeInterpreter (TI) and SHapley Additive exPlanations TreeExplainer (SHAP-TE).
We find that, although SHAP-TE offers consistency guarantees over TI at the cost of increased computation, consistency does not necessarily improve explanation performance in our case study.
arXiv Detail & Related papers (2020-10-13T23:18:26Z)
- MurTree: Optimal Classification Trees via Dynamic Programming and Search [61.817059565926336]
We present a novel algorithm for learning optimal classification trees based on dynamic programming and search.
Our approach uses only a fraction of the time required by the state-of-the-art and can handle datasets with tens of thousands of instances.
arXiv Detail & Related papers (2020-07-24T17:06:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.