Related papers: Explaining random forest prediction through diverse rulesets

URL: http://arxiv.org/abs/2203.15511v1
Date: Tue, 29 Mar 2022 12:54:57 GMT
Title: Explaining random forest prediction through diverse rulesets
Authors: Klest Dedja, Felipe Kenji Nakano, Konstantinos Pliakos, Celine Vens
Abstract summary: Local Tree eXtractor (LTreeX) is able to explain the forest prediction for a given test instance with a few diverse rules. We show that our proposed approach substantially outperforms other explainable methods in terms of predictive performance.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Tree-ensemble algorithms, such as random forest, are effective machine learning methods popular for their flexibility, high performance, and robustness to overfitting. However, since multiple learners are combined, they are not as interpretable as a single decision tree. In this work we propose a methodology, called Local Tree eXtractor (LTreeX), which is able to explain the forest prediction for a given test instance with a few diverse rules. Starting from the decision trees generated by a random forest, our method 1) pre-selects a subset of them, 2) creates a vector representation, and 3) finally clusters this representation. Each cluster prototype results in a rule that explains the test instance prediction. We test the effectiveness of LTreeX on 71 real-world datasets and we demonstrate the validity of our approach for binary classification, regression, multi-label classification and time-to-event tasks. In all set-ups, we show that our extracted surrogate model manages to approximate the performance of the corresponding ensemble model, while selecting only a few trees from the whole forest. We also show that our proposed approach substantially outperforms other explainable methods in terms of predictive performance.
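
The three steps named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not state the pre-selection criterion, the vector representation, or the clustering algorithm, so the choices below (closeness to the forest prediction, per-feature test counts, and a greedy farthest-point selection standing in for clustering) are assumptions, as are the hand-written forest and all function names.

```python
from collections import Counter

# Hypothetical toy forest. A tree is either a leaf (float prediction) or a
# tuple (feature_index, threshold, left_subtree, right_subtree).
FOREST = [
    (0, 0.5, 0.1, (1, 0.3, 0.7, 0.9)),
    (0, 0.6, 0.2, (1, 0.2, 0.6, 0.8)),
    (2, 0.4, (0, 0.5, 0.2, 0.9), 0.3),
    (2, 0.5, (0, 0.4, 0.1, 0.8), 0.4),
    (1, 0.9, 0.5, 0.0),
]


def follow(tree, x):
    """Route x to a leaf; return (prediction, list of tests satisfied by x)."""
    path = []
    while isinstance(tree, tuple):
        feat, thr, left, right = tree
        if x[feat] <= thr:
            path.append((feat, "<=", thr))
            tree = left
        else:
            path.append((feat, ">", thr))
            tree = right
    return tree, path


def explain(forest, x, n_keep=4, n_rules=2):
    results = [follow(tree, x) for tree in forest]
    forest_pred = sum(pred for pred, _ in results) / len(results)

    # Step 1 (assumed criterion): keep the trees that agree most closely
    # with the forest prediction for this instance.
    kept = sorted(results, key=lambda r: abs(r[0] - forest_pred))[:n_keep]

    # Step 2 (assumed representation): count how often each feature is
    # tested along the path that x follows in each kept tree.
    def to_vector(path):
        counts = Counter(feat for feat, _, _ in path)
        return [counts[f] for f in range(len(x))]

    vectors = [to_vector(path) for _, path in kept]

    # Step 3 (stand-in for clustering): greedily pick prototypes that are
    # as far apart as possible, so the resulting rules are diverse.
    def sq_dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    prototypes = [0]
    while len(prototypes) < min(n_rules, len(kept)):
        farthest = max(
            range(len(kept)),
            key=lambda i: min(sq_dist(vectors[i], vectors[p]) for p in prototypes),
        )
        prototypes.append(farthest)
    return forest_pred, [kept[p] for p in prototypes]


def format_rule(path):
    return " AND ".join(f"x[{feat}] {op} {thr}" for feat, op, thr in path)


pred, rules = explain(FOREST, [0.7, 0.5, 0.2])
print(f"forest prediction: {pred:.2f}")
for tree_pred, path in rules:
    print(f"IF {format_rule(path)} THEN {tree_pred}")
```

Each returned rule is the root-to-leaf path of one prototype tree, so the explanation uses only a small part of the forest.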

Related papers

Can a Single Tree Outperform an Entire Forest? [5.448070998907116]
The prevailing mindset is that a single decision tree underperforms classic random forests in testing accuracy. This study challenges such a mindset by significantly improving the testing accuracy of an oblique regression tree. Our approach reformulates tree training as a differentiable unconstrained optimization task.
arXiv Detail & Related papers (2024-11-26T00:18:18Z)
Binary Classification: Is Boosting stronger than Bagging? [5.877778007271621]
We introduce Enhanced Random Forests, an extension of vanilla Random Forests with extra functionalities and adaptive sample and model weighting. We develop an iterative algorithm for adapting the training sample weights, by favoring the hardest examples, and an approach for finding personalized tree weighting schemes for each new sample. Our method significantly improves upon regular Random Forests across 15 different binary classification datasets and considerably outperforms other tree methods, including XGBoost.
arXiv Detail & Related papers (2024-10-24T23:22:33Z)
Extending Explainable Ensemble Trees (E2Tree) to regression contexts [1.5186937600119894]
E2Tree is a novel methodology for explaining random forests. It accounts for the effects of predictor variables on the response. It also accounts for associations between the predictor variables through the computation and use of dissimilarity measures.
arXiv Detail & Related papers (2024-09-10T11:42:55Z)
A Unified Approach to Extract Interpretable Rules from Tree Ensembles via Integer Programming [2.1408617023874443]
Tree ensemble methods are known for their effectiveness in supervised classification and regression tasks. Our work aims to extract an optimized list of rules from a trained tree ensemble, providing the user with a condensed, interpretable model.
arXiv Detail & Related papers (2024-06-30T22:33:47Z)
Learning a Decision Tree Algorithm with Transformers [75.96920867382859]
We introduce MetaTree, a transformer-based model trained via meta-learning to directly produce strong decision trees. We fit both greedy decision trees and globally optimized decision trees on a large number of datasets, and train MetaTree to produce only the trees that achieve strong generalization performance.
arXiv Detail & Related papers (2024-02-06T07:40:53Z)
Why do Random Forests Work? Understanding Tree Ensembles as Self-Regularizing Adaptive Smoothers [68.76846801719095]
We argue that the current high-level dichotomy into bias- and variance-reduction prevalent in statistics is insufficient to understand tree ensembles. We show that forests can improve upon trees by three distinct mechanisms that are usually implicitly entangled.
arXiv Detail & Related papers (2024-02-02T15:36:43Z)
Unboxing Tree Ensembles for interpretability: a hierarchical visualization tool and a multivariate optimal re-built tree [0.34530027457862006]
We develop an interpretable representation of a tree-ensemble model that can provide valuable insights into its behavior. The proposed model is effective in yielding a shallow interpretable tree approximating the tree-ensemble decision function.
arXiv Detail & Related papers (2023-02-15T10:43:31Z)
A Mathematical Programming Approach to Optimal Classification Forests [1.0705399532413618]
We propose a novel mathematical optimization-based methodology in which a given number of trees are simultaneously constructed. The classification rule is derived by assigning to each observation its most frequently predicted class among the trees in the forest. We show that our proposed method has equal or superior performance compared with state-of-the-art tree-based classification methods.
arXiv Detail & Related papers (2022-11-18T20:33:08Z)
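
The voting rule mentioned in the summary above (each observation receives its most frequently predicted class among the trees) can be sketched in a few lines. The function name and the example votes are hypothetical, and the sketch covers only the aggregation step, not the paper's optimization model.

```python
from collections import Counter


def forest_vote(tree_predictions):
    """Return the class predicted most often by the individual trees."""
    return Counter(tree_predictions).most_common(1)[0][0]


# Hypothetical per-tree outputs for a single observation.
votes = ["spam", "ham", "spam", "spam", "ham"]
label = forest_vote(votes)
print(label)
```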
Contextual Decision Trees [62.997667081978825]
We propose a multi-armed contextual bandit recommendation framework for feature-based selection of a single shallow tree of the learned ensemble. The trained system, which works on top of the Random Forest, dynamically identifies a base predictor that is responsible for providing the final output.
arXiv Detail & Related papers (2022-07-13T17:05:08Z)
Social Interpretable Tree for Pedestrian Trajectory Prediction [75.81745697967608]
We propose a tree-based method, termed Social Interpretable Tree (SIT), to address this multi-modal prediction task. A path in the tree from the root to leaf represents an individual possible future trajectory. Despite the hand-crafted tree, the experimental results on ETH-UCY and Stanford Drone datasets demonstrate that our method is capable of matching or exceeding the performance of state-of-the-art methods.
arXiv Detail & Related papers (2022-05-26T12:18:44Z)
Growing Deep Forests Efficiently with Soft Routing and Learned Connectivity [79.83903179393164]
This paper further extends the deep forest idea in several important aspects. We employ a probabilistic tree whose nodes make probabilistic routing decisions, a.k.a. soft routing, rather than hard binary decisions. Experiments on the MNIST dataset demonstrate that our empowered deep forests can achieve better or comparable performance than [1],[3].
arXiv Detail & Related papers (2020-12-29T18:05:05Z)
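
As a rough illustration of soft routing as described in the summary above, the sketch below sends an input to both children of every node, weighted by a routing probability, instead of following a single hard branch. The sigmoid gate, the dictionary layout, and all names are assumptions made for illustration; the summary does not specify the paper's parametrization.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def soft_predict(node, x):
    """Expected leaf value when every node routes left with probability p."""
    if not isinstance(node, dict):  # a leaf holds a plain number
        return float(node)
    z = sum(w * xi for w, xi in zip(node["w"], x)) + node["b"]
    p_left = sigmoid(z)  # assumed gate: logistic function of a linear score
    return (p_left * soft_predict(node["left"], x)
            + (1.0 - p_left) * soft_predict(node["right"], x))


# Hypothetical one-node tree: left leaf predicts 1.0, right leaf predicts 0.0.
tree = {"w": [1.0, -1.0], "b": 0.0, "left": 1.0, "right": 0.0}
y = soft_predict(tree, [0.0, 0.0])  # score 0 gives an even split
print(y)
```

With a large positive score the gate saturates and the output approaches the left leaf's value, which recovers the behaviour of a hard binary split.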
MurTree: Optimal Classification Trees via Dynamic Programming and Search [61.817059565926336]
We present a novel algorithm for learning optimal classification trees based on dynamic programming and search. Our approach uses only a fraction of the time required by the state-of-the-art and can handle datasets with tens of thousands of instances.
arXiv Detail & Related papers (2020-07-24T17:06:55Z)

This list is automatically generated from the titles and abstracts of the papers on this site.