Improving the Accuracy-Memory Trade-Off of Random Forests Via Leaf-Refinement
- URL: http://arxiv.org/abs/2110.10075v1
- Date: Tue, 19 Oct 2021 16:06:43 GMT
- Title: Improving the Accuracy-Memory Trade-Off of Random Forests Via Leaf-Refinement
- Authors: Sebastian Buschjäger, Katharina Morik
- Abstract summary: Random Forests (RF) are among the state-of-the-art in many machine learning applications.
We show that the improvement effects of pruning diminish for ensembles of large trees but that pruning has an overall better accuracy-memory trade-off than RF.
We present a simple, yet surprisingly effective algorithm that refines the predictions in the leaf nodes of the forest via gradient descent.
- Score: 6.967385165474138
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Random Forests (RF) are among the state-of-the-art in many machine learning
applications. With the ongoing integration of ML models into everyday life,
deploying and continuously running models becomes an increasingly important
issue. Hence, small models that offer good predictive performance while using
little memory are required. Ensemble pruning is a standard
technique to remove unnecessary classifiers from an ensemble to reduce the
overall resource consumption and sometimes even improve the performance of the
original ensemble. In this paper, we revisit ensemble pruning in the context of
`modernly' trained Random Forests where trees are very large. We show that the
improvement effects of pruning diminish for ensembles of large trees but that
pruning has an overall better accuracy-memory trade-off than RF. However,
pruning does not offer fine-grained control over this trade-off because it
removes entire trees from the ensemble. To further improve the accuracy-memory
trade-off we present a simple, yet surprisingly effective algorithm that
refines the predictions in the leaf nodes of the forest via stochastic gradient
descent. We evaluate our method against 7 state-of-the-art pruning methods and
show that our method outperforms the other methods on 11 of 16 datasets with a
statistically significantly better accuracy-memory trade-off compared to most
methods. We conclude our experimental evaluation with a case study showing that
our method can be applied in a real-world setting.
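The leaf-refinement idea lends itself to a compact implementation. Below is a minimal sketch, assuming a scikit-learn RandomForestClassifier and a simple squared-error loss on one-hot labels; the loss, learning-rate schedule, regularization, and any combination with pruning used in the paper may differ, and all names and hyperparameters here are illustrative.

```python
# Minimal sketch of leaf refinement via SGD (illustrative, not the paper's exact algorithm).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=2000, n_features=20, n_informative=10,
                           n_classes=3, random_state=0)
n_classes = 3
Y = np.eye(n_classes)[y]                                    # one-hot targets

forest = RandomForestClassifier(n_estimators=32, random_state=0).fit(X, y)

# One refinable vector per tree node, initialised with the class distribution
# each tree already stores; only the entries of leaves are ever updated.
leaf_ids = [tree.apply(X) for tree in forest.estimators_]   # leaf index per sample
leaf_w = []
for tree in forest.estimators_:
    counts = tree.tree_.value[:, 0, :]                      # (n_nodes, n_classes)
    leaf_w.append(counts / np.maximum(counts.sum(axis=1, keepdims=True), 1e-12))

rng = np.random.default_rng(0)
lr, batch_size, n_steps = 0.1, 64, 500
for _ in range(n_steps):
    idx = rng.choice(len(X), size=batch_size, replace=False)
    # Forest output = average of the leaf vectors the batch samples fall into.
    pred = np.mean([w[l[idx]] for w, l in zip(leaf_w, leaf_ids)], axis=0)
    grad = 2.0 * (pred - Y[idx]) / len(forest.estimators_)  # d(squared error)/d(leaf vector)
    for w, l in zip(leaf_w, leaf_ids):
        np.add.at(w, l[idx], -lr * grad)                    # SGD step on the leaves that were hit

refined = np.mean([w[l] for w, l in zip(leaf_w, leaf_ids)], axis=0)
print("refined training accuracy:", (refined.argmax(axis=1) == y).mean())
```

Because only the values stored in existing leaves change, the size of the forest stays essentially the same, which is why refinement offers a finer-grained accuracy-memory trade-off than removing whole trees.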
Related papers
- Free Lunch in the Forest: Functionally-Identical Pruning of Boosted Tree Ensembles [45.962492329047215]
We introduce a method to prune a tree ensemble into a reduced version that is "functionally identical" to the original model.
We formalize the problem of functionally identical pruning on ensembles, introduce an exact optimization model, and provide a fast yet highly effective method to prune large ensembles.
arXiv Detail & Related papers (2024-08-28T23:15:46Z)
- Bypass Back-propagation: Optimization-based Structural Pruning for Large Language Models via Policy Gradient [57.9629676017527]
We propose optimization-based structural pruning for Large Language Models.
We learn the pruning masks in a probabilistic space directly by optimizing the loss of the pruned model.
Our method runs in 2.7 hours using around 35GB of memory for 13B-parameter models on a single A100 GPU.
arXiv Detail & Related papers (2024-06-15T09:31:03Z)
- PUMA: margin-based data pruning [51.12154122266251]
We focus on data pruning, where some training samples are removed based on their distance to the model's classification boundary (i.e., their margin).
We propose PUMA, a new data pruning strategy that computes the margin using DeepFool.
We show that PUMA can be used on top of the current state-of-the-art methodology in robustness, and unlike existing data pruning strategies it significantly improves model performance. (A minimal sketch of margin-based pruning appears after this list.)
arXiv Detail & Related papers (2024-05-10T08:02:20Z)
- Adaptive Split Balancing for Optimal Random Forest [8.916614661563893]
We propose a new random forest algorithm that constructs the trees using a novel adaptive split-balancing method.
Our method achieves optimality in simple, smooth scenarios while adaptively learning the tree structure from the data.
arXiv Detail & Related papers (2024-02-17T09:10:40Z)
- Bayesian post-hoc regularization of random forests [0.0]
Random Forests are powerful ensemble learning algorithms widely used in various machine learning tasks.
We propose post-hoc regularization to leverage the reliable patterns captured by leaf nodes closer to the root.
We have evaluated the performance of our method on various machine learning data sets.
arXiv Detail & Related papers (2023-06-06T14:15:29Z)
- Winner-Take-All Column Row Sampling for Memory Efficient Adaptation of Language Model [89.8764435351222]
We propose a new family of unbiased estimators called WTA-CRS for matrix products with reduced variance.
Our work provides both theoretical and experimental evidence that, in the context of tuning transformers, our proposed estimators exhibit lower variance compared to existing ones.
arXiv Detail & Related papers (2023-05-24T15:52:08Z)
- A cautionary tale on fitting decision trees to data from additive models: generalization lower bounds [9.546094657606178]
We study the generalization performance of decision trees with respect to different generative regression models.
This allows us to elicit their inductive bias, that is, the assumptions the algorithms make (or do not make) to generalize to new data.
We prove a sharp squared error generalization lower bound for a large class of decision tree algorithms fitted to sparse additive models.
arXiv Detail & Related papers (2021-10-18T21:22:40Z)
- MLPruning: A Multilevel Structured Pruning Framework for Transformer-based Models [78.45898846056303]
Pruning is an effective method to reduce the memory footprint and computational cost associated with large natural language processing models.
We develop a novel MultiLevel structured Pruning framework, which uses three different levels of structured pruning: head pruning, row pruning, and block-wise sparse pruning.
arXiv Detail & Related papers (2021-05-30T22:00:44Z)
- Residual Likelihood Forests [19.97069303172077]
This paper presents a novel ensemble learning approach called Residual Likelihood Forests (RLF).
Our weak learners produce conditional likelihoods that are sequentially optimized using global loss in the context of previous learners.
When compared against several ensemble approaches including Random Forests and Gradient Boosted Trees, RLFs offer a significant improvement in performance.
arXiv Detail & Related papers (2020-11-04T00:59:41Z)
- Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of variations can be covered in a unified framework that we propose.
We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
arXiv Detail & Related papers (2020-06-10T08:22:41Z)
- Movement Pruning: Adaptive Sparsity by Fine-Tuning [115.91907953454034]
Magnitude pruning is a widely used strategy for reducing model size in pure supervised learning.
We propose the use of movement pruning, a simple, deterministic first-order weight pruning method.
Experiments show that when pruning large pretrained language models, movement pruning shows significant improvements in high-sparsity regimes.
arXiv Detail & Related papers (2020-05-15T17:54:15Z)
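To complement the PUMA entry above, here is a minimal sketch of margin-based data pruning. It is a simplification under stated assumptions: a logistic-regression score margin stands in for the DeepFool-based margin used by PUMA, and dropping the 30% lowest-margin samples is an arbitrary illustrative choice.

```python
# Minimal sketch of margin-based data pruning (score margin instead of DeepFool; illustrative only).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=5000, n_features=20, random_state=0)

# Fit a reference model and score every training sample.
ref_model = LogisticRegression(max_iter=1000).fit(X, y)
proba = ref_model.predict_proba(X)

# Margin = probability of the true class minus the best competing class.
rows = np.arange(len(y))
true_p = proba[rows, y]
competitors = proba.copy()
competitors[rows, y] = -np.inf
margin = true_p - competitors.max(axis=1)

# Drop the samples closest to the decision boundary and retrain on the rest.
keep = np.argsort(margin)[int(0.3 * len(y)):]
pruned_model = LogisticRegression(max_iter=1000).fit(X[keep], y[keep])
print("kept", len(keep), "of", len(y), "training samples")
```

Whether low-margin or high-margin samples should be removed depends on the goal (the entry above targets robustness); the sketch only illustrates the mechanics of computing a margin and pruning by it.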