Unboxing Tree Ensembles for interpretability: a hierarchical
visualization tool and a multivariate optimal re-built tree
- URL: http://arxiv.org/abs/2302.07580v2
- Date: Thu, 18 Jan 2024 18:42:28 GMT
- Title: Unboxing Tree Ensembles for interpretability: a hierarchical
visualization tool and a multivariate optimal re-built tree
- Authors: Giulia Di Teodoro, Marta Monaci, Laura Palagi
- Abstract summary: We develop an interpretable representation of a tree-ensemble model that can provide valuable insights into its behavior.
The proposed model is effective in yielding a shallow interpretable tree approximating the tree-ensemble decision function.
- Score: 0.34530027457862006
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The interpretability of models has become a crucial issue in Machine Learning
because of algorithmic decisions' growing impact on real-world applications.
Tree ensemble methods, such as Random Forests or XGBoost, are powerful learning
tools for classification tasks. However, while combining multiple trees may
provide higher prediction quality than a single tree, it sacrifices
interpretability, resulting in "black-box" models. In light of this, we
aim to develop an interpretable representation of a tree-ensemble model that
can provide valuable insights into its behavior. First, given a target
tree-ensemble model, we develop a hierarchical visualization tool based on a
heatmap representation of the forest's feature use, considering the frequency
of a feature and the level at which it is selected as an indicator of
importance. Next, we propose a mixed-integer linear programming (MILP)
formulation for constructing a single optimal multivariate tree that accurately
mimics the target model predictions. The goal is to provide an interpretable
surrogate model based on oblique hyperplane splits, which uses only the most
relevant features according to the defined forest's importance indicators. The
MILP model includes a penalty on feature selection based on their frequency in
the forest to further induce sparsity of the splits. The natural formulation
has been strengthened to improve the computational performance of
mixed-integer software. Computational experiments are carried out on benchmark
datasets from the UCI repository using a state-of-the-art off-the-shelf solver.
Results show that the proposed model is effective in yielding a shallow
interpretable tree approximating the tree-ensemble decision function.
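The heatmap of the forest's feature use described in the abstract can be sketched in a few lines. The following is a minimal pure-Python illustration, not the authors' implementation: it assumes a hypothetical tree representation (nested dicts with `"feature"`, `"left"`, `"right"` keys, `None` for leaves) and counts how often each feature is selected as a split at each depth level, which is the importance indicator the paper combines frequency and level into.

```python
# Sketch of the forest feature-use matrix behind the heatmap.
# Hypothetical tree format: nested dicts with keys "feature",
# "left", "right"; leaves are None. Not the paper's actual code.

def feature_use_matrix(forest, n_features, max_depth):
    """Count how often each feature is split on at each tree level."""
    counts = [[0] * n_features for _ in range(max_depth)]

    def walk(node, depth):
        if node is None or depth >= max_depth:
            return
        counts[depth][node["feature"]] += 1
        walk(node["left"], depth + 1)
        walk(node["right"], depth + 1)

    for tree in forest:
        walk(tree, 0)
    return counts  # rows: depth levels, columns: features

# Two toy trees: feature 0 is the root split of both, so it
# dominates the first row (level 0) of the matrix.
t1 = {"feature": 0,
      "left": {"feature": 1, "left": None, "right": None},
      "right": None}
t2 = {"feature": 0, "left": None,
      "right": {"feature": 2, "left": None, "right": None}}
heat = feature_use_matrix([t1, t2], n_features=3, max_depth=2)
print(heat)  # [[2, 0, 0], [0, 1, 1]]
```

Rendering `counts` row by row (levels on one axis, features on the other) gives the hierarchical heatmap; features that appear frequently and near the root are the ones the MILP surrogate is then encouraged to reuse.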
Related papers
- A Unified Approach to Extract Interpretable Rules from Tree Ensembles via Integer Programming [2.1408617023874443]
Tree ensemble methods are known for their effectiveness in supervised classification and regression tasks.
Our work aims to extract an optimized list of rules from a trained tree ensemble, providing the user with a condensed, interpretable model.
arXiv Detail & Related papers (2024-06-30T22:33:47Z) - Optimized Feature Generation for Tabular Data via LLMs with Decision Tree Reasoning [53.241569810013836]
We propose a new framework based on large language models (LLMs) and decision tree reasoning (OCTree).
Our key idea is to leverage LLMs' reasoning capabilities to find good feature generation rules without manually specifying the search space.
Our empirical results demonstrate that this simple framework consistently enhances the performance of various prediction models.
arXiv Detail & Related papers (2024-06-12T08:31:34Z) - GrootVL: Tree Topology is All You Need in State Space Model [66.36757400689281]
GrootVL is a versatile multimodal framework that can be applied to both visual and textual tasks.
Our method significantly outperforms existing structured state space models on image classification, object detection and segmentation.
By fine-tuning large language models, our approach achieves consistent improvements in multiple textual tasks at minor training cost.
arXiv Detail & Related papers (2024-06-04T15:09:29Z) - Feature graphs for interpretable unsupervised tree ensembles: centrality, interaction, and application in disease subtyping [0.24578723416255746]
Feature selection assumes a pivotal role in enhancing model interpretability.
The accuracy gained from aggregating decision trees comes at the expense of interpretability.
The study introduces novel methods to construct feature graphs from unsupervised random forests.
arXiv Detail & Related papers (2024-04-27T12:47:37Z) - ViTree: Single-path Neural Tree for Step-wise Interpretable Fine-grained
Visual Categorization [56.37520969273242]
We introduce ViTree, a novel approach for fine-grained visual categorization.
By traversing the tree paths, ViTree effectively selects patches from transformer-processed features to highlight informative local regions.
This patch and path selectivity enhances model interpretability of ViTree, enabling better insights into the model's inner workings.
arXiv Detail & Related papers (2024-01-30T14:32:25Z) - On marginal feature attributions of tree-based models [0.11184789007828977]
Local feature attributions based on marginal expectations, e.g. marginal Shapley, Owen or Banzhaf values, may be employed.
We present two (statistically similar) decision trees that compute the exact same function for which the "path-dependent" TreeSHAP yields different rankings of features.
We exploit the symmetry to derive an explicit formula, with improved complexity and only in terms of the internal model parameters, for marginal Shapley (and Banzhaf and Owen) values of CatBoost models.
arXiv Detail & Related papers (2023-02-16T17:18:03Z) - Explaining random forest prediction through diverse rulesets [0.0]
Local Tree eXtractor (LTreeX) is able to explain the forest prediction for a given test instance with a few diverse rules.
We show that our proposed approach substantially outperforms other explainable methods in terms of predictive performance.
arXiv Detail & Related papers (2022-03-29T12:54:57Z) - Deep Reinforcement Learning of Graph Matching [63.469961545293756]
Graph matching (GM) under node and pairwise constraints has been a building block in areas from optimization to computer vision.
We present a reinforcement learning solver for GM i.e. RGM that seeks the node correspondence between pairwise graphs.
Our method differs from previous deep graph matching models, which focus on front-end feature extraction and affinity function learning.
arXiv Detail & Related papers (2020-12-16T13:48:48Z) - MurTree: Optimal Classification Trees via Dynamic Programming and Search [61.817059565926336]
We present a novel algorithm for learning optimal classification trees based on dynamic programming and search.
Our approach uses only a fraction of the time required by the state-of-the-art and can handle datasets with tens of thousands of instances.
arXiv Detail & Related papers (2020-07-24T17:06:55Z) - Tree-AMP: Compositional Inference with Tree Approximate Message Passing [23.509275850721778]
Tree-AMP is a Python package for compositional inference in high-dimensional tree-structured models.
The package provides a unifying framework to study several approximate message passing algorithms.
arXiv Detail & Related papers (2020-04-03T13:51:10Z) - ENTMOOT: A Framework for Optimization over Ensemble Tree Models [57.98561336670884]
ENTMOOT is a framework for integrating tree models into larger optimization problems.
We show how ENTMOOT allows a simple integration of tree models into decision-making and black-box optimization.
arXiv Detail & Related papers (2020-03-10T14:34:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences.