Why do Random Forests Work? Understanding Tree Ensembles as
Self-Regularizing Adaptive Smoothers
- URL: http://arxiv.org/abs/2402.01502v1
- Date: Fri, 2 Feb 2024 15:36:43 GMT
- Title: Why do Random Forests Work? Understanding Tree Ensembles as
Self-Regularizing Adaptive Smoothers
- Authors: Alicia Curth and Alan Jeffares and Mihaela van der Schaar
- Abstract summary: We argue that the current high-level dichotomy into bias- and variance-reduction prevalent in statistics is insufficient to understand tree ensembles.
We show that forests can improve upon trees by three distinct mechanisms that are usually implicitly entangled.
- Score: 68.76846801719095
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite their remarkable effectiveness and broad application, the drivers of
success underlying ensembles of trees are still not fully understood. In this
paper, we highlight how interpreting tree ensembles as adaptive and
self-regularizing smoothers can provide new intuition and deeper insight to
this topic. We use this perspective to show that, when studied as smoothers,
randomized tree ensembles not only make predictions that are quantifiably more
smooth than the predictions of the individual trees they consist of, but also
further regulate their smoothness at test-time based on the dissimilarity
between testing and training inputs. First, we use this insight to revisit,
refine and reconcile two recent explanations of forest success by providing a
new way of quantifying the conjectured behaviors of tree ensembles objectively
by measuring the effective degree of smoothing they imply. Then, we move beyond
existing explanations for the mechanisms by which tree ensembles improve upon
individual trees and challenge the popular wisdom that the superior performance
of forests should be understood as a consequence of variance reduction alone.
We argue that the current high-level dichotomy into bias- and
variance-reduction prevalent in statistics is insufficient to understand tree
ensembles -- because the prevailing definition of bias does not capture
differences in the expressivity of the hypothesis classes formed by trees and
forests. Instead, we show that forests can improve upon trees by three distinct
mechanisms that are usually implicitly entangled. In particular, we demonstrate
that the smoothing effect of ensembling can reduce variance in predictions due
to noise in outcome generation, reduce variability in the quality of the
learned function given fixed input data and reduce potential bias in learnable
functions by enriching the available hypothesis space.
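To make the smoother view concrete, here is a minimal sketch (not the authors' code; it assumes scikit-learn and NumPy and uses the standard leaf-co-membership weights) that writes a random forest's prediction as a weighted average of the training targets and compares how concentrated those weights are for a single tree versus the whole ensemble, one way to read off an effective degree of smoothing:

```python
# Minimal sketch (assumes scikit-learn and NumPy; not the paper's code):
# view a random forest as a smoother by recovering the weights it places on
# the training targets, then compare how spread out (smooth) those weights
# are for a single tree versus the full ensemble.
import numpy as np
from sklearn.datasets import make_friedman1
from sklearn.ensemble import RandomForestRegressor

X, y = make_friedman1(n_samples=300, noise=1.0, random_state=0)
X_train, y_train, x_test = X[:-1], y[:-1], X[-1:]

# bootstrap=False keeps the weighted-average identity exact; feature
# subsampling still randomizes the individual trees.
forest = RandomForestRegressor(
    n_estimators=200, max_features=0.5, bootstrap=False, random_state=0
).fit(X_train, y_train)

# Leaf membership of every training point and of the test point in each tree.
train_leaves = forest.apply(X_train)   # shape (n_train, n_trees)
test_leaves = forest.apply(x_test)     # shape (1, n_trees)

# Per-tree smoother weights: uniform over the training points that share the
# test point's leaf; the forest's weights are the average over trees.
same_leaf = train_leaves == test_leaves            # (n_train, n_trees)
per_tree_w = same_leaf / same_leaf.sum(axis=0)     # each column sums to 1
forest_w = per_tree_w.mean(axis=1)                 # sums to 1

# The smoother form reproduces the forest prediction ...
print("forest.predict:             ", forest.predict(x_test)[0])
print("weighted average of y_train:", forest_w @ y_train)

# ... and the ensemble weights are far less concentrated than a single
# tree's, quantifying the extra smoothing contributed by ensembling.
eff_neighbors = lambda w: 1.0 / np.sum(w ** 2)
print("effective neighbors, single tree:", eff_neighbors(per_tree_w[:, 0]))
print("effective neighbors, forest:     ", eff_neighbors(forest_w))
```

Computing the same weights for test points increasingly far from the training inputs gives a direct handle on the test-time regulation of smoothness the paper describes.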
Related papers
- Distilling interpretable causal trees from causal forests [0.0]
A causal forest estimates a high-dimensional distribution of conditional average treatment effects, which may give accurate, individual-level estimates.
This paper proposes the Distilled Causal Tree, a method for distilling a single, interpretable causal tree from a causal forest.
arXiv Detail & Related papers (2024-08-02T05:48:15Z)
- Ensembles of Probabilistic Regression Trees [46.53457774230618]
Tree-based ensemble methods have been successfully used for regression problems in many applications and research studies.
We study ensemble versions of probabilistic regression trees that provide smooth approximations of the objective function by assigning each observation to each region with respect to a probability distribution.
arXiv Detail & Related papers (2024-06-20T06:51:51Z)
- Identifiable Latent Neural Causal Models [82.14087963690561]
Causal representation learning seeks to uncover latent, high-level causal representations from low-level observed data.
We determine the types of distribution shifts that do contribute to the identifiability of causal representations.
We translate our findings into a practical algorithm, allowing for the acquisition of reliable latent causal representations.
arXiv Detail & Related papers (2024-03-23T04:13:55Z)
- Theoretical and Empirical Advances in Forest Pruning [0.0]
We revisit forest pruning, an approach that aims to have the best of both worlds: the accuracy of regression forests and the interpretability of regression trees.
We prove the advantage of a Lasso-pruned forest over its unpruned counterpart under extremely weak assumptions.
We find that in the vast majority of scenarios tested, there is at least one forest-pruning method that yields equal or better accuracy than the original full forest.
arXiv Detail & Related papers (2024-01-10T20:02:47Z)
- On the Pointwise Behavior of Recursive Partitioning and Its Implications for Heterogeneous Causal Effect Estimation [8.394633341978007]
Decision tree learning is increasingly being used for pointwise inference.
We show that adaptive decision trees can fail to achieve rates of convergence in the norm with non-vanishing probability.
We show that random forests can remedy the situation, turning poor performing trees into nearly optimal procedures.
arXiv Detail & Related papers (2022-11-19T21:28:30Z)
- Hybrid Predictive Coding: Inferring, Fast and Slow [62.997667081978825]
We propose a hybrid predictive coding network that combines both iterative and amortized inference in a principled manner.
We demonstrate that our model is inherently sensitive to its uncertainty and adaptively balances iterative and amortized inference to obtain accurate beliefs using minimum computational expense.
arXiv Detail & Related papers (2022-04-05T12:52:45Z)
- Trees, Forests, Chickens, and Eggs: When and Why to Prune Trees in a Random Forest [8.513154770491898]
We argue that tree depth should be seen as a natural form of regularization across the entire procedure.
In particular, our work suggests that random forests with shallow trees are advantageous when the signal-to-noise ratio in the data is low.
arXiv Detail & Related papers (2021-03-30T21:57:55Z)
- Growing Deep Forests Efficiently with Soft Routing and Learned Connectivity [79.83903179393164]
This paper further extends the deep forest idea in several important aspects.
We employ a probabilistic tree whose nodes make probabilistic routing decisions, a.k.a., soft routing, rather than hard binary decisions.
Experiments on the MNIST dataset demonstrate that our empowered deep forests can achieve performance better than or comparable to [1],[3].
arXiv Detail & Related papers (2020-12-29T18:05:05Z)
- Achieving Reliable Causal Inference with Data-Mined Variables: A Random Forest Approach to the Measurement Error Problem [1.5749416770494704]
A common empirical strategy involves the application of predictive modeling techniques to 'mine' variables of interest from available data.
Recent work highlights that, because the predictions from machine learning models are inevitably imperfect, econometric analyses based on the predicted variables are likely to suffer from bias due to measurement error.
We propose a novel approach to mitigate these biases, leveraging the ensemble learning technique known as the random forest.
arXiv Detail & Related papers (2020-12-19T21:48:23Z)
- Rectified Decision Trees: Exploring the Landscape of Interpretable and Effective Machine Learning [66.01622034708319]
We propose a knowledge distillation-based extension of decision trees, dubbed rectified decision trees (ReDT).
We extend the splitting criteria and the stopping condition of standard decision trees, which allows training with soft labels.
We then train the ReDT on soft labels distilled from a well-trained teacher model through a novel jackknife-based method (a generic distillation sketch is given below).
arXiv Detail & Related papers (2020-08-21T10:45:25Z)
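As a rough illustration of the distillation idea in the ReDT entry above (not the authors' modified splitting criteria or jackknife procedure; the teacher model and dataset below are stand-ins), a single tree can be fit to a teacher's soft class probabilities rather than to the hard labels:

```python
# Rough sketch of soft-label distillation into a single tree (assumes
# scikit-learn; a generic stand-in, not the ReDT algorithm itself).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeRegressor

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# Teacher: any well-trained model that outputs class probabilities.
teacher = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
soft_labels = teacher.predict_proba(X)       # shape (n_samples, n_classes)

# Student: a shallow regression tree trained on the soft labels, so its
# leaves store class-probability vectors instead of majority votes.
student = DecisionTreeRegressor(max_depth=4, random_state=0).fit(X, soft_labels)

# Predicted probabilities and hard labels from the distilled tree.
student_proba = student.predict(X)
student_pred = student_proba.argmax(axis=1)
print("agreement with teacher:", (student_pred == teacher.predict(X)).mean())
```

The in-sample teacher probabilities used here are the crudest choice; the paper's jackknife-based method replaces them with a more careful estimate of the soft labels.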