Explainable Models via Compression of Tree Ensembles
- URL: http://arxiv.org/abs/2206.07904v1
- Date: Thu, 16 Jun 2022 04:03:55 GMT
- Title: Explainable Models via Compression of Tree Ensembles
- Authors: Siwen Yan, Sriraam Natarajan, Saket Joshi, Roni Khardon and Prasad
Tadepalli
- Abstract summary: We consider the problem of compressing a large set of learned trees into a single explainable model.
CoTE -- Compression of Tree Ensembles -- produces a single small decision list as a compressed representation.
An experimental evaluation demonstrates the effectiveness of CoTE in several benchmark relational data sets.
- Score: 23.790618990173
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Ensemble models (bagging and gradient-boosting) of relational decision trees
have proved to be one of the most effective learning methods in the area of
probabilistic logic models (PLMs). While effective, they lose one of the most
important aspects of PLMs -- interpretability. In this paper we consider the
problem of compressing a large set of learned trees into a single explainable
model. To this effect, we propose CoTE -- Compression of Tree Ensembles -- that
produces a single small decision list as a compressed representation. CoTE
first converts the trees to decision lists and then performs the combination
and compression with the aid of the original training set. An experimental
evaluation demonstrates the effectiveness of CoTE in several benchmark
relational data sets.
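The abstract's first step (turning each learned tree into a decision list) can be illustrated with a small, hypothetical sketch. The code below uses scikit-learn's propositional decision trees purely for illustration; CoTE itself operates on relational trees, and its combination and compression steps over the whole ensemble are not shown here.
```python
# A minimal sketch, assuming scikit-learn, of converting one learned tree into
# an ordered decision list (one IF-THEN rule per leaf). This is NOT the CoTE
# algorithm itself, which works on relational trees and also combines and
# compresses the resulting lists using the original training set.
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier
import numpy as np

X, y = load_breast_cancer(return_X_y=True)
clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
t = clf.tree_

def tree_to_decision_list(node=0, conds=()):
    """Collect one (conditions, predicted class) rule per leaf by walking root-to-leaf paths."""
    if t.children_left[node] == -1:               # leaf: emit a rule
        predicted = int(np.argmax(t.value[node]))
        return [(list(conds), predicted)]
    f, thr = t.feature[node], t.threshold[node]
    left = tree_to_decision_list(t.children_left[node], conds + ((f, "<=", thr),))
    right = tree_to_decision_list(t.children_right[node], conds + ((f, ">", thr),))
    return left + right

for conds, predicted in tree_to_decision_list():
    body = " AND ".join(f"x[{f}] {op} {thr:.3f}" for f, op, thr in conds)
    print(f"IF {body} THEN class {predicted}")
```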
Related papers
- Decision Trees for Interpretable Clusters in Mixture Models and Deep Representations [5.65604054654671]
We introduce the notion of an explainability-to-noise ratio for mixture models.
We propose an algorithm that takes as input a mixture model and constructs a suitable tree in data-independent time.
We prove upper and lower bounds on the error rate of the resulting decision tree.
arXiv Detail & Related papers (2024-11-03T14:00:20Z)
- Inherently Interpretable Tree Ensemble Learning [7.868733904112288]
We show that when shallow decision trees are used as base learners, the ensemble learning algorithms can become inherently interpretable.
An interpretation algorithm is developed that converts the tree ensemble into the functional ANOVA representation with inherent interpretability.
Experiments on simulations and real-world datasets show that our proposed methods offer a better trade-off between model interpretation and predictive performance.
arXiv Detail & Related papers (2024-10-24T18:58:41Z)
- Free Lunch in the Forest: Functionally-Identical Pruning of Boosted Tree Ensembles [45.962492329047215]
We introduce a method to prune a tree ensemble into a reduced version that is "functionally identical" to the original model.
We formalize the problem of functionally identical pruning on ensembles, introduce an exact optimization model, and provide a fast yet highly effective method to prune large ensembles.
arXiv Detail & Related papers (2024-08-28T23:15:46Z)
- Why do Random Forests Work? Understanding Tree Ensembles as Self-Regularizing Adaptive Smoothers [68.76846801719095]
We argue that the current high-level dichotomy into bias- and variance-reduction prevalent in statistics is insufficient to understand tree ensembles.
We show that forests can improve upon trees by three distinct mechanisms that are usually implicitly entangled.
arXiv Detail & Related papers (2024-02-02T15:36:43Z)
- A Robust Hypothesis Test for Tree Ensemble Pruning [2.4923006485141284]
We develop and present a novel theoretically justified hypothesis test of split quality for gradient boosted tree ensembles.
We show that using this method instead of the common penalty terms leads to a significant reduction in out-of-sample loss.
We also present several innovative extensions to the method, opening the door for a wide variety of novel tree pruning algorithms.
arXiv Detail & Related papers (2023-01-24T16:31:49Z)
- Subgroup Robustness Grows On Trees: An Empirical Baseline Investigation [13.458414200958797]
We conduct an empirical comparison of several previously proposed methods for fair and robust learning alongside state-of-the-art tree-based methods.
We show that tree-based methods have strong subgroup robustness, even when compared to robustness- and fairness-enhancing methods.
arXiv Detail & Related papers (2022-11-23T04:49:18Z)
- TreeMix: Compositional Constituency-based Data Augmentation for Natural Language Understanding [56.794981024301094]
We propose a compositional data augmentation approach for natural language understanding called TreeMix.
Specifically, TreeMix leverages constituency parse trees to decompose sentences into constituent sub-structures and the Mixup data augmentation technique to recombine them into new sentences.
Compared with previous approaches, TreeMix introduces greater diversity to the generated samples and encourages models to learn the compositionality of NLP data.
arXiv Detail & Related papers (2022-05-12T15:25:12Z)
- Robustifying Algorithms of Learning Latent Trees with Vector Variables [92.18777020401484]
We present the sample complexities of Recursive Grouping (RG) and Chow-Liu Recursive Grouping (CLRG).
We robustify RG, CLRG, Neighbor Joining (NJ) and Spectral NJ (SNJ) by using the truncated inner product.
We derive the first known instance-dependent impossibility result for structure learning of latent trees.
arXiv Detail & Related papers (2021-06-02T01:37:52Z)
- Learning from Non-Binary Constituency Trees via Tensor Decomposition [12.069862650316262]
We introduce a new approach to deal with non-binary constituency trees.
We show how a powerful composition function based on the canonical tensor decomposition can exploit such a rich structure.
We experimentally assess its performance on different NLP tasks.
arXiv Detail & Related papers (2020-11-02T10:06:59Z)
- Convex Polytope Trees [57.56078843831244]
Convex polytope trees (CPT) are proposed to expand the family of decision trees through an interpretable generalization of their decision boundary.
We develop a greedy method to efficiently construct CPT and scalable end-to-end training algorithms for the tree parameters when the tree structure is given.
arXiv Detail & Related papers (2020-10-21T19:38:57Z)
- Rectified Decision Trees: Exploring the Landscape of Interpretable and Effective Machine Learning [66.01622034708319]
We propose a knowledge-distillation-based extension of decision trees, dubbed rectified decision trees (ReDT).
We extend the splitting criteria and the ending condition of standard decision trees, which allows training with soft labels.
We then train the ReDT on soft labels distilled from a well-trained teacher model through a novel jackknife-based method (a generic distillation sketch appears after this list).
arXiv Detail & Related papers (2020-08-21T10:45:25Z)
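To make the distillation idea in the ReDT entry above concrete, here is a minimal, hedged sketch: an ensemble teacher produces soft labels, and a single shallow tree is fit to them. This is a generic illustration only; ReDT's modified splitting criteria and jackknife-based soft-label estimation are not reproduced, and all model choices below (random forest teacher, depth-4 regression-tree student, 0.5 threshold) are assumptions made for the example.
```python
# A generic ensemble-to-tree distillation sketch (NOT the ReDT algorithm):
# fit an ensemble teacher, extract its class probabilities as soft labels,
# and fit one small tree to mimic those probabilities.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Teacher: an ensemble whose predicted class probabilities serve as soft labels.
teacher = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
soft_labels = teacher.predict_proba(X_tr)[:, 1]   # P(class 1) per training point

# Student: one shallow tree regressed onto the teacher's probabilities,
# then thresholded at 0.5 to recover hard class predictions.
student = DecisionTreeRegressor(max_depth=4, random_state=0).fit(X_tr, soft_labels)
student_pred = (student.predict(X_te) >= 0.5).astype(int)

print("teacher accuracy:", teacher.score(X_te, y_te))
print("student accuracy:", float((student_pred == y_te).mean()))
```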