Unboxing Tree Ensembles for interpretability: a hierarchical
visualization tool and a multivariate optimal re-built tree
- URL: http://arxiv.org/abs/2302.07580v2
- Date: Thu, 18 Jan 2024 18:42:28 GMT
- Title: Unboxing Tree Ensembles for interpretability: a hierarchical
visualization tool and a multivariate optimal re-built tree
- Authors: Giulia Di Teodoro, Marta Monaci, Laura Palagi
- Abstract summary: We develop an interpretable representation of a tree-ensemble model that can provide valuable insights into its behavior.
The proposed model is effective in yielding a shallow interpretable tree approximating the tree-ensemble decision function.
- Score: 0.34530027457862006
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The interpretability of models has become a crucial issue in Machine Learning
because of algorithmic decisions' growing impact on real-world applications.
Tree ensemble methods, such as Random Forests or XGBoost, are powerful learning
tools for classification tasks. However, while combining multiple trees may
provide higher prediction quality than a single tree, it sacrifices
interpretability, resulting in "black-box" models. In light of this, we
aim to develop an interpretable representation of a tree-ensemble model that
can provide valuable insights into its behavior. First, given a target
tree-ensemble model, we develop a hierarchical visualization tool based on a
heatmap representation of the forest's feature use, considering the frequency
of a feature and the level at which it is selected as an indicator of
importance. Next, we propose a mixed-integer linear programming (MILP)
formulation for constructing a single optimal multivariate tree that accurately
mimics the target model predictions. The goal is to provide an interpretable
surrogate model based on oblique hyperplane splits, which uses only the most
relevant features according to the defined forest's importance indicators. The
MILP model includes a penalty on feature selection based on their frequency in
the forest to further induce sparsity of the splits. The natural formulation
has been strengthened to improve the computational performance of
mixed-integer software. Computational experiments are carried out on benchmark
datasets from the UCI repository using a state-of-the-art off-the-shelf solver.
Results show that the proposed model is effective in yielding a shallow
interpretable tree approximating the tree-ensemble decision function.
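The heatmap of the forest's feature use described in the abstract can be sketched in a few lines. The following is a minimal pure-Python illustration, not the authors' implementation: it assumes a hypothetical tree representation (nested dicts with `"feature"`, `"left"`, `"right"` keys, `None` for leaves) and counts how often each feature is selected as a split at each depth level, which is the importance indicator the paper combines frequency and level into.

```python
# Sketch of the forest feature-use matrix behind the heatmap.
# Hypothetical tree format: nested dicts with keys "feature",
# "left", "right"; leaves are None. Not the paper's actual code.

def feature_use_matrix(forest, n_features, max_depth):
    """Count how often each feature is split on at each tree level."""
    counts = [[0] * n_features for _ in range(max_depth)]

    def walk(node, depth):
        if node is None or depth >= max_depth:
            return
        counts[depth][node["feature"]] += 1
        walk(node["left"], depth + 1)
        walk(node["right"], depth + 1)

    for tree in forest:
        walk(tree, 0)
    return counts  # rows: depth levels, columns: features

# Two toy trees: feature 0 is the root split of both, so it
# dominates the first row (level 0) of the matrix.
t1 = {"feature": 0,
      "left": {"feature": 1, "left": None, "right": None},
      "right": None}
t2 = {"feature": 0, "left": None,
      "right": {"feature": 2, "left": None, "right": None}}
heat = feature_use_matrix([t1, t2], n_features=3, max_depth=2)
print(heat)  # [[2, 0, 0], [0, 1, 1]]
```

Rendering `counts` row by row (levels on one axis, features on the other) gives the hierarchical heatmap; features that appear frequently and near the root are the ones the MILP surrogate is then encouraged to reuse.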
Related papers
- A Unified Approach to Extract Interpretable Rules from Tree Ensembles via Integer Programming [2.1408617023874443]
Tree ensemble methods are known for their effectiveness in supervised classification and regression tasks.
Our work aims to extract an optimized list of rules from a trained tree ensemble, providing the user with a condensed, interpretable model.
arXiv Detail & Related papers (2024-06-30T22:33:47Z) - Optimized Feature Generation for Tabular Data via LLMs with Decision Tree Reasoning [53.241569810013836]
We propose a new framework based on large language models (LLMs) and decision tree reasoning (OCTree).
Our key idea is to leverage LLMs' reasoning capabilities to find good feature generation rules without manually specifying the search space.
Our empirical results demonstrate that this simple framework consistently enhances the performance of various prediction models.
arXiv Detail & Related papers (2024-06-12T08:31:34Z) - GrootVL: Tree Topology is All You Need in State Space Model [66.36757400689281]
GrootVL is a versatile multimodal framework that can be applied to both visual and textual tasks.
Our method significantly outperforms existing structured state space models on image classification, object detection and segmentation.
By fine-tuning large language models, our approach achieves consistent improvements in multiple textual tasks at minor training cost.
arXiv Detail & Related papers (2024-06-04T15:09:29Z) - Feature graphs for interpretable unsupervised tree ensembles: centrality, interaction, and application in disease subtyping [0.24578723416255746]
Feature selection assumes a pivotal role in enhancing model interpretability.
The accuracy gained from aggregating decision trees comes at the expense of interpretability.
The study introduces novel methods to construct feature graphs from unsupervised random forests.
arXiv Detail & Related papers (2024-04-27T12:47:37Z) - ViTree: Single-path Neural Tree for Step-wise Interpretable Fine-grained
Visual Categorization [56.37520969273242]
We introduce ViTree, a novel approach for fine-grained visual categorization.
By traversing the tree paths, ViTree effectively selects patches from transformer-processed features to highlight informative local regions.
This patch and path selectivity enhances model interpretability of ViTree, enabling better insights into the model's inner workings.
arXiv Detail & Related papers (2024-01-30T14:32:25Z) - On marginal feature attributions of tree-based models [0.11184789007828977]
Local feature attributions based on marginal expectations, e.g. marginal Shapley, Owen or Banzhaf values, may be employed.
We present two (statistically similar) decision trees that compute the exact same function for which the "path-dependent" TreeSHAP yields different rankings of features.
We exploit the symmetry to derive an explicit formula, with improved complexity and only in terms of the internal model parameters, for marginal Shapley (and Banzhaf and Owen) values of CatBoost models.
arXiv Detail & Related papers (2023-02-16T17:18:03Z) - Explaining random forest prediction through diverse rulesets [0.0]
Local Tree eXtractor (LTreeX) is able to explain the forest prediction for a given test instance with a few diverse rules.
We show that our proposed approach substantially outperforms other explainable methods in terms of predictive performance.
arXiv Detail & Related papers (2022-03-29T12:54:57Z) - Deep Reinforcement Learning of Graph Matching [63.469961545293756]
Graph matching (GM) under node and pairwise constraints has been a building block in areas from optimization to computer vision.
We present a reinforcement learning solver for GM i.e. RGM that seeks the node correspondence between pairwise graphs.
Our method differs from previous deep graph matching models, which focus on front-end feature extraction and affinity function learning.
arXiv Detail & Related papers (2020-12-16T13:48:48Z) - MurTree: Optimal Classification Trees via Dynamic Programming and Search [61.817059565926336]
We present a novel algorithm for learning optimal classification trees based on dynamic programming and search.
Our approach uses only a fraction of the time required by the state-of-the-art and can handle datasets with tens of thousands of instances.
arXiv Detail & Related papers (2020-07-24T17:06:55Z) - Tree-AMP: Compositional Inference with Tree Approximate Message Passing [23.509275850721778]
Tree-AMP is a Python package for compositional inference in high-dimensional tree-structured models.
The package provides a unifying framework to study several approximate message passing algorithms.
arXiv Detail & Related papers (2020-04-03T13:51:10Z) - ENTMOOT: A Framework for Optimization over Ensemble Tree Models [57.98561336670884]
ENTMOOT is a framework for integrating tree models into larger optimization problems.
We show how ENTMOOT allows a simple integration of tree models into decision-making and black-box optimization.
arXiv Detail & Related papers (2020-03-10T14:34:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences.