ForestPrune: Compact Depth-Controlled Tree Ensembles
- URL: http://arxiv.org/abs/2206.00128v3
- Date: Wed, 24 May 2023 20:17:05 GMT
- Title: ForestPrune: Compact Depth-Controlled Tree Ensembles
- Authors: Brian Liu and Rahul Mazumder
- Abstract summary: We present ForestPrune, a novel framework to post-process tree ensembles by pruning depth layers from individual trees.
We develop a specialized optimization algorithm to efficiently obtain high-quality solutions to the optimization problems formulated by ForestPrune.
Our experiments demonstrate that ForestPrune produces parsimonious models that outperform models extracted by existing post-processing algorithms.
- Score: 7.538482310185135
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Tree ensembles are powerful models that achieve excellent predictive
performance but can grow to unwieldy sizes. These ensembles are often
post-processed (pruned) to reduce memory footprint and improve
interpretability. We present ForestPrune, a novel optimization framework to
post-process tree ensembles by pruning depth layers from individual trees.
Since the number of nodes in a decision tree increases exponentially with tree
depth, pruning deep trees drastically reduces ensemble size. We develop a
specialized optimization algorithm to efficiently obtain high-quality solutions
to the optimization problems formulated by ForestPrune. Our algorithm typically
reaches good solutions in seconds on medium-size datasets and ensembles, with
tens of thousands of rows and hundreds of trees, yielding significant speedups
over existing approaches. Our
experiments demonstrate that ForestPrune produces parsimonious models that
outperform models extracted by existing post-processing algorithms.
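The abstract includes no code; as a rough illustration only, here is a minimal sketch of the core operation, depth-layer pruning of fitted scikit-learn trees. The per-tree depth assignment `depths` is a hypothetical stand-in for the solution of ForestPrune's optimization problem, which this sketch does not solve.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

def predict_truncated(est, X, max_depth):
    """Predict with one fitted sklearn tree, cut off after `max_depth` levels."""
    t = est.tree_
    out = np.empty(X.shape[0])
    for i, x in enumerate(X):
        node, depth = 0, 0
        # Descend until reaching a leaf (children_left == -1) or the budget.
        while t.children_left[node] != -1 and depth < max_depth:
            if x[t.feature[node]] <= t.threshold[node]:
                node = t.children_left[node]
            else:
                node = t.children_right[node]
            depth += 1
        out[i] = t.value[node][0][0]  # mean training response at the node
    return out

X, y = make_regression(n_samples=500, n_features=10, random_state=0)
forest = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

# Hypothetical per-tree depth assignment standing in for ForestPrune's
# optimized solution; depth 0 removes a tree from the ensemble entirely.
depths = np.random.default_rng(0).integers(0, 4, size=len(forest.estimators_))
kept = [(est, d) for est, d in zip(forest.estimators_, depths) if d > 0]
pred = np.mean([predict_truncated(est, X, d) for est, d in kept], axis=0)
```

A tree truncated at depth d has at most 2^(d+1) - 1 nodes, which is why trimming even a few of the deepest layers shrinks an ensemble dramatically.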
Related papers
- Free Lunch in the Forest: Functionally-Identical Pruning of Boosted Tree Ensembles [45.962492329047215]
We introduce a method to prune a tree ensemble into a reduced version that is "functionally identical" to the original model.
We formalize the problem of functionally identical pruning on ensembles, introduce an exact optimization model, and provide a fast yet highly effective method to prune large ensembles.
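As a toy illustration (not the paper's exact method, which certifies equivalence over the entire input domain via an optimization model), the "functionally identical" criterion can be checked empirically on a finite reference set:

```python
import numpy as np

def functionally_identical(pred_full, pred_pruned, X_ref, tol=1e-9):
    """True if the pruned model reproduces the original on X_ref."""
    return float(np.max(np.abs(pred_full(X_ref) - pred_pruned(X_ref)))) <= tol

# Trivial demo with a hypothetical linear "model" compared to itself.
X_ref = np.random.default_rng(0).normal(size=(100, 3))
f = lambda X: X @ np.array([1.0, 2.0, 3.0])
print(functionally_identical(f, f, X_ref))  # True by construction
```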
arXiv Detail & Related papers (2024-08-28T23:15:46Z)
- Learning a Decision Tree Algorithm with Transformers [75.96920867382859]
We introduce MetaTree, a transformer-based model trained via meta-learning to directly produce strong decision trees.
We fit both greedy decision trees and globally optimized decision trees on a large number of datasets, and train MetaTree to produce only the trees that achieve strong generalization performance.
arXiv Detail & Related papers (2024-02-06T07:40:53Z)
- MAPTree: Beating "Optimal" Decision Trees with Bayesian Decision Trees [2.421336072915701]
We present a Bayesian approach to decision tree induction via maximum a posteriori inference over a posterior distribution on trees.
We propose an AND/OR search algorithm, dubbed MAPTree, which is able to recover the maximum a posteriori tree.
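The blurb omits the objective being maximized; below is a hedged sketch of the kind of BCART-style posterior score such a MAP search could optimize. The paper's exact formulation may differ, and `alpha`, `beta`, `a`, `b` are assumed hyperparameters.

```python
from math import lgamma, log

def log_beta(a, b):
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def leaf_log_marginal(n_pos, n_neg, a=1.0, b=1.0):
    """Beta-Bernoulli marginal likelihood of the labels falling in one leaf."""
    return log_beta(a + n_pos, b + n_neg) - log_beta(a, b)

def split_log_prior(depth, alpha=0.95, beta=0.5):
    """BCART-style prior: log-probability that a node at `depth` splits."""
    return log(alpha) - beta * log(1 + depth)

# A MAP search scores a candidate tree by summing split/stop log-priors over
# its nodes plus leaf_log_marginal over its leaves, and returns the argmax.
print(split_log_prior(0), leaf_log_marginal(8, 2))
```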
arXiv Detail & Related papers (2023-09-26T23:43:37Z)
- Bayesian post-hoc regularization of random forests [0.0]
Random Forests are powerful ensemble learning algorithms widely used in various machine learning tasks.
We propose post-hoc regularization to leverage the reliable patterns captured by leaf nodes closer to the root.
We have evaluated the performance of our method on various machine learning data sets.
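The abstract gives no formula; the following is a loose sketch of one plausible reading, shrinking each node's estimate toward its parent with a prior strength `lam`. This is hierarchical-shrinkage-style smoothing under assumed hyperparameters, not necessarily the paper's Bayesian scheme.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

def shrunken_node_values(est, lam=10.0):
    """Blend each node's mean with its parent's already-blended estimate."""
    t = est.tree_
    new_value = t.value[:, 0, 0].copy()

    def recurse(node, parent_est):
        n = t.n_node_samples[node]
        # Posterior-mean-style blend: small, deep nodes get pulled toward
        # their ancestors, leaving reliable near-root patterns dominant.
        blended = (n * t.value[node, 0, 0] + lam * parent_est) / (n + lam)
        new_value[node] = blended
        if t.children_left[node] != -1:  # -1 marks a leaf in sklearn trees
            recurse(t.children_left[node], blended)
            recurse(t.children_right[node], blended)

    recurse(0, t.value[0, 0, 0])
    return new_value

X, y = make_regression(n_samples=300, n_features=5, random_state=0)
tree = DecisionTreeRegressor(max_depth=6, random_state=0).fit(X, y)
print(shrunken_node_values(tree)[:5])
```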
arXiv Detail & Related papers (2023-06-06T14:15:29Z)
- Growing Deep Forests Efficiently with Soft Routing and Learned Connectivity [79.83903179393164]
This paper further extends the deep forest idea in several important aspects.
We employ a probabilistic tree whose nodes make probabilistic routing decisions, a.k.a. soft routing, rather than hard binary decisions.
Experiments on the MNIST dataset demonstrate that our empowered deep forests can achieve performance better than or comparable to [1] and [3].
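For intuition, here is a minimal sketch of soft routing at a single node, using an assumed sigmoid-gated form; the paper's parameterization may differ.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def soft_tree_predict(x, node):
    """node = (w, b, left, right) for internal nodes, or a float leaf value."""
    if not isinstance(node, tuple):
        return node
    w, b, left, right = node
    p_right = sigmoid(w @ x + b)  # fraction of the sample routed rightward
    return ((1.0 - p_right) * soft_tree_predict(x, left)
            + p_right * soft_tree_predict(x, right))

# Depth-1 example: one learned hyperplane gating two leaf values.
tree = (np.array([1.0, -0.5]), 0.1, -1.0, 1.0)
print(soft_tree_predict(np.array([0.2, 0.4]), tree))
```

Because every operation is differentiable, such trees can be trained end-to-end by gradient descent, which is what makes learning the connectivity feasible.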
arXiv Detail & Related papers (2020-12-29T18:05:05Z)
- Convex Polytope Trees [57.56078843831244]
Convex polytope trees (CPT) are proposed to expand the family of decision trees via an interpretable generalization of their decision boundaries.
We develop a greedy method to efficiently construct CPT and scalable end-to-end training algorithms for the tree parameters when the tree structure is given.
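As a hedged sketch of the split form the name suggests (an assumed halfspace representation, not necessarily the paper's exact parameterization): a sample is routed by testing membership in a convex polytope, the intersection of several halfspaces, rather than by crossing a single axis-aligned threshold.

```python
import numpy as np

def polytope_route_left(x, W, c):
    """True if x lies inside the polytope {x : W @ x <= c} (route left)."""
    return bool(np.all(W @ x <= c))

# Three halfspaces in 2-D define one node's decision region.
W = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [-1.0, -1.0]])
c = np.array([1.0, 1.0, 0.5])
print(polytope_route_left(np.array([0.3, 0.2]), W, c))  # True: inside
```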
arXiv Detail & Related papers (2020-10-21T19:38:57Z)
- MurTree: Optimal Classification Trees via Dynamic Programming and Search [61.817059565926336]
We present a novel algorithm for learning optimal classification trees based on dynamic programming and search.
Our approach uses only a fraction of the time required by the state-of-the-art and can handle datasets with tens of thousands of instances.
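A toy sketch of the dynamic-programming flavor follows: recursion over (row subset, remaining depth) with memoization. MurTree's actual algorithm adds far stronger bounds, caching, and a specialized depth-2 solver; this sketch also assumes binary features.

```python
from functools import lru_cache

def optimal_tree_misclassifications(X, y, depth):
    """Minimum training misclassifications of any tree of at most `depth`."""
    n_features = len(X[0])

    @lru_cache(maxsize=None)
    def solve(rows, d):
        ones = sum(y[i] for i in rows)
        best = min(ones, len(rows) - ones)  # cost of stopping as a leaf
        if d == 0 or best == 0:
            return best
        for f in range(n_features):
            left = tuple(i for i in rows if X[i][f] == 0)
            right = tuple(i for i in rows if X[i][f] == 1)
            if left and right:
                best = min(best, solve(left, d - 1) + solve(right, d - 1))
        return best

    return solve(tuple(range(len(y))), depth)

# Tiny demo: XOR over two binary features needs depth 2 for zero error.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [0, 1, 1, 0]
print(optimal_tree_misclassifications(X, y, 1))  # 2
print(optimal_tree_misclassifications(X, y, 2))  # 0
```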
arXiv Detail & Related papers (2020-07-24T17:06:55Z)
- Generalized and Scalable Optimal Sparse Decision Trees [56.35541305670828]
We present techniques that produce optimal decision trees over a variety of objectives.
We also introduce a scalable algorithm that produces provably optimal results in the presence of continuous variables.
arXiv Detail & Related papers (2020-06-15T19:00:11Z)