ForestPrune: Compact Depth-Controlled Tree Ensembles
- URL: http://arxiv.org/abs/2206.00128v3
- Date: Wed, 24 May 2023 20:17:05 GMT
- Title: ForestPrune: Compact Depth-Controlled Tree Ensembles
- Authors: Brian Liu and Rahul Mazumder
- Abstract summary: We present ForestPrune, a novel framework to post-process tree ensembles by pruning depth layers from individual trees.
We develop a specialized optimization algorithm to efficiently obtain high-quality solutions to problems under ForestPrune.
Our experiments demonstrate that ForestPrune produces parsimonious models that outperform models extracted by existing post-processing algorithms.
- Score: 7.538482310185135
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Tree ensembles are powerful models that achieve excellent predictive
performances, but can grow to unwieldy sizes. These ensembles are often
post-processed (pruned) to reduce memory footprint and improve
interpretability. We present ForestPrune, a novel optimization framework to
post-process tree ensembles by pruning depth layers from individual trees.
Since the number of nodes in a decision tree increases exponentially with tree
depth, pruning deep trees drastically compactifies ensembles. We develop a
specialized optimization algorithm to efficiently obtain high-quality solutions
to problems under ForestPrune. Our algorithm typically reaches good solutions
in seconds for medium-size datasets and ensembles, with 10000s of rows and 100s
of trees, resulting in significant speedups over existing approaches. Our
experiments demonstrate that ForestPrune produces parsimonious models that
outperform models extracted by existing post-processing algorithms.
Related papers
- Can a Single Tree Outperform an Entire Forest? [5.448070998907116]
The prevailing mindset is that a single decision tree underperforms classic random forests in testing accuracy.
This study challenges such a mindset by significantly improving the testing accuracy of an oblique regression tree.
Our approach reformulates tree training as a differentiable unconstrained optimization task.
arXiv Detail & Related papers (2024-11-26T00:18:18Z) - Free Lunch in the Forest: Functionally-Identical Pruning of Boosted Tree Ensembles [45.962492329047215]
We introduce a method to prune a tree ensemble into a reduced version that is "functionally identical" to the original model.
We formalize the problem of functionally identical pruning on ensembles, introduce an exact optimization model, and provide a fast yet highly effective method to prune large ensembles.
arXiv Detail & Related papers (2024-08-28T23:15:46Z) - Learning a Decision Tree Algorithm with Transformers [75.96920867382859]
We introduce MetaTree, a transformer-based model trained via meta-learning to directly produce strong decision trees.
We fit both greedy decision trees and globally optimized decision trees on a large number of datasets, and train MetaTree to produce only the trees that achieve strong generalization performance.
arXiv Detail & Related papers (2024-02-06T07:40:53Z) - MAPTree: Beating "Optimal" Decision Trees with Bayesian Decision Trees [2.421336072915701]
We present a Bayesian approach to decision tree induction via maximum a posteriori inference of a posterior distribution over trees.
We propose an AND/OR search algorithm, dubbed MAPTree, which is able to recover the maximum a posteriori tree.
arXiv Detail & Related papers (2023-09-26T23:43:37Z) - Bayesian post-hoc regularization of random forests [0.0]
Random Forests are powerful ensemble learning algorithms widely used in various machine learning tasks.
We propose post-hoc regularization to leverage the reliable patterns captured by leaf nodes closer to the root.
We have evaluated the performance of our method on various machine learning data sets.
arXiv Detail & Related papers (2023-06-06T14:15:29Z) - Growing Deep Forests Efficiently with Soft Routing and Learned
Connectivity [79.83903179393164]
This paper further extends the deep forest idea in several important aspects.
We employ a probabilistic tree whose nodes make probabilistic routing decisions, a.k.a., soft routing, rather than hard binary decisions.
Experiments on the MNIST dataset demonstrate that our empowered deep forests can achieve better or comparable performance than [1],[3].
arXiv Detail & Related papers (2020-12-29T18:05:05Z) - Convex Polytope Trees [57.56078843831244]
convex polytope trees (CPT) are proposed to expand the family of decision trees by an interpretable generalization of their decision boundary.
We develop a greedy method to efficiently construct CPT and scalable end-to-end training algorithms for the tree parameters when the tree structure is given.
arXiv Detail & Related papers (2020-10-21T19:38:57Z) - MurTree: Optimal Classification Trees via Dynamic Programming and Search [61.817059565926336]
We present a novel algorithm for learning optimal classification trees based on dynamic programming and search.
Our approach uses only a fraction of the time required by the state-of-the-art and can handle datasets with tens of thousands of instances.
arXiv Detail & Related papers (2020-07-24T17:06:55Z) - Generalized and Scalable Optimal Sparse Decision Trees [56.35541305670828]
We present techniques that produce optimal decision trees over a variety of objectives.
We also introduce a scalable algorithm that produces provably optimal results in the presence of continuous variables.
arXiv Detail & Related papers (2020-06-15T19:00:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.