TREX: Tree-Ensemble Representer-Point Explanations
- URL: http://arxiv.org/abs/2009.05530v3
- Date: Thu, 16 Dec 2021 22:53:22 GMT
- Title: TREX: Tree-Ensemble Representer-Point Explanations
- Authors: Jonathan Brophy and Daniel Lowd
- Abstract summary: TREX is an explanation system that provides instance-attribution explanations for tree ensembles.
Since tree ensembles are non-differentiable, we define a kernel that captures the structure of the specific tree ensemble.
The weights in the kernel expansion of the surrogate model are used to define the global or local importance of each training example.
- Score: 13.109852233032395
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: How can we identify the training examples that contribute most to the
prediction of a tree ensemble? In this paper, we introduce TREX, an explanation
system that provides instance-attribution explanations for tree ensembles, such
as random forests and gradient boosted trees. TREX builds on the representer
point framework previously developed for explaining deep neural networks. Since
tree ensembles are non-differentiable, we define a kernel that captures the
structure of the specific tree ensemble. By using this kernel in kernel
logistic regression or a support vector machine, TREX builds a surrogate model
that approximates the original tree ensemble. The weights in the kernel
expansion of the surrogate model are used to define the global or local
importance of each training example.
Our experiments show that TREX's surrogate model accurately approximates the
tree ensemble; its global importance weights are more effective in dataset
debugging than the previous state-of-the-art; its explanations identify the
most influential samples better than alternative methods under the remove and
retrain evaluation framework; it runs orders of magnitude faster than
alternative methods; and its local explanations can identify and explain errors
due to domain mismatch.
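To make the pipeline concrete, here is a minimal sketch in scikit-learn, reconstructed from the abstract alone: the forest's leaf assignments define a feature map (so the kernel counts the trees in which two examples reach the same leaf), an SVM with that precomputed kernel serves as the surrogate, and the dual weights give per-example importance. The leaf-co-occurrence kernel and the SVC surrogate are plausible instantiations, not necessarily TREX's exact choices.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import OneHotEncoder
from sklearn.svm import SVC

# Toy binary task standing in for a real dataset.
X, y = make_classification(n_samples=200, random_state=0)
X_train, y_train, X_test = X[:150], y[:150], X[150:]

# The tree ensemble to be explained.
rf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_train, y_train)

# Tree-kernel feature map: one-hot encode the leaf each tree routes an example to,
# so k(a, b) = number of trees in which a and b fall in the same leaf.
enc = OneHotEncoder(handle_unknown="ignore").fit(rf.apply(X_train))
phi_train = enc.transform(rf.apply(X_train))
phi_test = enc.transform(rf.apply(X_test))

# Surrogate model: an SVM trained on the precomputed tree kernel.
K_train = (phi_train @ phi_train.T).toarray()
svm = SVC(kernel="precomputed").fit(K_train, y_train)

# Global importance of training example i: |alpha_i * y_i| from the kernel expansion.
global_imp = np.zeros(len(X_train))
global_imp[svm.support_] = np.abs(svm.dual_coef_[0])
print("most influential training examples:", global_imp.argsort()[-5:])

# Local importance of training example i for a test point x: alpha_i * y_i * k(x_i, x).
K_test = (phi_test @ phi_train[svm.support_].T).toarray()
local_imp = K_test * svm.dual_coef_[0]  # shape (n_test, n_support)
```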
Related papers
- Decision Trees for Interpretable Clusters in Mixture Models and Deep Representations [5.65604054654671]
We introduce the notion of an explainability-to-noise ratio for mixture models.
We propose an algorithm that takes as input a mixture model and constructs a suitable tree in data-independent time.
We prove upper and lower bounds on the error rate of the resulting decision tree.
arXiv Detail & Related papers (2024-11-03T14:00:20Z)
- Learning a Decision Tree Algorithm with Transformers [75.96920867382859]
We introduce MetaTree, a transformer-based model trained via meta-learning to directly produce strong decision trees.
We fit both greedy decision trees and globally optimized decision trees on a large number of datasets, and train MetaTree to produce only the trees that achieve strong generalization performance.
arXiv Detail & Related papers (2024-02-06T07:40:53Z)
- ViTree: Single-path Neural Tree for Step-wise Interpretable Fine-grained Visual Categorization [56.37520969273242]
We introduce ViTree, a novel approach for fine-grained visual categorization.
By traversing the tree paths, ViTree effectively selects patches from transformer-processed features to highlight informative local regions.
This patch and path selectivity enhances ViTree's interpretability, enabling better insight into the model's inner workings.
arXiv Detail & Related papers (2024-01-30T14:32:25Z)
- Hierarchical clustering with dot products recovers hidden tree structure [53.68551192799585]
In this paper, we offer a new perspective on the well-established agglomerative clustering algorithm, focusing on the recovery of hierarchical structure.
We recommend a simple variant of the standard algorithm, in which clusters are merged by maximum average dot product and not, for example, by minimum distance or within-cluster variance.
We demonstrate that the tree output by this algorithm provides a bona fide estimate of generative hierarchical structure in data, under a generic probabilistic graphical model.
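As a concrete illustration of this merge rule, the sketch below implements a naive average-dot-product agglomeration. The O(n^3) pair search and the random data are illustrative simplifications, not the paper's algorithm or analysis.

```python
import numpy as np

def agglomerate_by_dot(X):
    """Merge the pair of clusters with the highest average pairwise dot
    product until one cluster remains; the merge order defines a tree."""
    clusters = [[i] for i in range(len(X))]
    G = X @ X.T  # all pairwise dot products, computed once
    merges = []
    while len(clusters) > 1:
        pairs = [(a, b) for a in range(len(clusters))
                 for b in range(a + 1, len(clusters))]
        # Pick the pair with the highest average dot product between members.
        a, b = max(pairs, key=lambda p: G[np.ix_(clusters[p[0]], clusters[p[1]])].mean())
        merges.append((clusters[a], clusters[b]))
        merged = clusters[a] + clusters[b]
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)] + [merged]
    return merges

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))
print(agglomerate_by_dot(X)[-1])  # the final merge joins the two root subtrees
```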
arXiv Detail & Related papers (2023-05-24T11:05:12Z)
- Unboxing Tree Ensembles for interpretability: a hierarchical visualization tool and a multivariate optimal re-built tree [0.34530027457862006]
We develop an interpretable representation of a tree-ensemble model that can provide valuable insights into its behavior.
The proposed model is effective in yielding a shallow interpretable tree that approximates the tree-ensemble decision function.
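The generic version of this idea, distilling the ensemble into a single shallow tree, fits in a few lines. The depth-3 CART surrogate below is a standard stand-in, not the paper's multivariate optimal re-built tree.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = make_classification(n_samples=500, random_state=0)
ensemble = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Distill: train a depth-3 tree on the ensemble's predicted labels, so the
# surrogate mimics the ensemble's decision function rather than the raw labels.
surrogate = DecisionTreeClassifier(max_depth=3).fit(X, ensemble.predict(X))
print(export_text(surrogate))  # human-readable rules
print("fidelity:", (surrogate.predict(X) == ensemble.predict(X)).mean())
```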
arXiv Detail & Related papers (2023-02-15T10:43:31Z)
- RLET: A Reinforcement Learning Based Approach for Explainable QA with Entailment Trees [47.745218107037786]
We propose RLET, a Reinforcement Learning based Entailment Tree generation framework.
RLET iteratively performs single-step reasoning with sentence selection and deduction generation modules.
Experiments on three settings of the EntailmentBank dataset demonstrate the strength of the RL framework.
arXiv Detail & Related papers (2022-10-31T06:45:05Z)
- Entailment Tree Explanations via Iterative Retrieval-Generation Reasoner [56.08919422452905]
We propose an architecture called the Iterative Retrieval-Generation Reasoner (IRGR).
Our model is able to explain a given hypothesis by systematically generating a step-by-step explanation from textual premises.
We outperform existing benchmarks on premise retrieval and entailment tree generation, with around 300% gain in overall correctness.
arXiv Detail & Related papers (2022-05-18T21:52:11Z)
- Hierarchical Shrinkage: improving the accuracy and interpretability of tree-based methods [10.289846887751079]
We introduce Hierarchical Shrinkage (HS), a post-hoc algorithm that does not modify the tree structure.
HS substantially increases the predictive performance of decision trees, even when used in conjunction with other regularization techniques.
All code and models are released in a full-fledged package available on GitHub.
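For context, one common formulation of HS replaces each node's average with a telescoping sum along the root-to-node path, damping every increment by a factor of 1 + lambda/N(parent), where N(parent) is the parent's sample count. The sketch below applies this post hoc to a scikit-learn regression tree; overwriting tree_.value in place is an implementation shortcut assumed here, not the authors' released API.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

def hierarchical_shrinkage(fitted_tree, lam=10.0):
    """Shrink node estimates toward their ancestors without changing the
    tree structure: each path increment is damped by 1 + lam / N(parent)."""
    t = fitted_tree.tree_
    mean = t.value[:, 0, 0].copy()       # per-node training means
    n = t.n_node_samples.astype(float)
    shrunk = mean.copy()                 # root keeps its original mean

    def recurse(node):
        for child in (t.children_left[node], t.children_right[node]):
            if child != -1:
                # Telescoping term: damped child-minus-parent difference.
                shrunk[child] = shrunk[node] + (mean[child] - mean[node]) / (1.0 + lam / n[node])
                recurse(child)

    recurse(0)
    t.value[:, 0, 0] = shrunk            # predict() now returns shrunk values
    return fitted_tree

X, y = make_regression(n_samples=300, noise=10.0, random_state=0)
reg = hierarchical_shrinkage(DecisionTreeRegressor(random_state=0).fit(X, y))
print(reg.predict(X[:3]))
```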
arXiv Detail & Related papers (2022-02-02T02:43:23Z)
- A cautionary tale on fitting decision trees to data from additive models: generalization lower bounds [9.546094657606178]
We study the generalization performance of decision trees with respect to different generative regression models.
This allows us to elicit their inductive bias, that is, the assumptions the algorithms make (or do not make) to generalize to new data.
We prove a sharp squared error generalization lower bound for a large class of decision tree algorithms fitted to sparse additive models.
arXiv Detail & Related papers (2021-10-18T21:22:40Z)
- Spectral Top-Down Recovery of Latent Tree Models [13.681975313065477]
Spectral Top-Down Recovery (STDR) is a divide-and-conquer approach for inference of large latent tree models.
STDR's partitioning step is non-random. Instead, it is based on the Fiedler vector of a suitable Laplacian matrix related to the observed nodes.
We prove that STDR is statistically consistent, and bound the number of samples required to accurately recover the tree with high probability.
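To make the partitioning step concrete: the Fiedler vector is the eigenvector of the graph Laplacian associated with the second-smallest eigenvalue, and thresholding its sign splits the nodes into two groups. The toy similarity matrix below is a made-up stand-in for the paper's "suitable Laplacian matrix related to the observed nodes".

```python
import numpy as np

def fiedler_partition(W):
    """Split the nodes of a similarity graph by the sign of the Fiedler vector."""
    L = np.diag(W.sum(axis=1)) - W        # unnormalized graph Laplacian
    eigvals, eigvecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    fiedler = eigvecs[:, 1]               # eigenvector of the 2nd-smallest eigenvalue
    return fiedler >= 0                   # boolean partition of the nodes

# Two loosely connected cliques; the split should recover them.
W = np.ones((6, 6)) - np.eye(6)
W[:3, 3:] = W[3:, :3] = 0.01
print(fiedler_partition(W))  # e.g. [ True  True  True False False False]
```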
arXiv Detail & Related papers (2021-02-26T02:47:42Z)
- Visualizing hierarchies in scRNA-seq data using a density tree-biased autoencoder [50.591267188664666]
We propose an approach for identifying a meaningful tree structure from high-dimensional scRNA-seq data.
We then introduce DTAE, a tree-biased autoencoder that emphasizes the tree structure of the data in low dimensional space.
arXiv Detail & Related papers (2021-02-11T08:48:48Z)