Random Forests as Statistical Procedures: Design, Variance, and Dependence
- URL: http://arxiv.org/abs/2602.13104v2
- Date: Tue, 17 Feb 2026 17:50:58 GMT
- Title: Random Forests as Statistical Procedures: Design, Variance, and Dependence
- Authors: Nathaniel S. O'Connell
- Abstract summary: We develop a finite-sample, design-based formulation of random forests in which each tree is an explicit randomized conditional regression function. This perspective yields an exact variance identity for the forest predictor that separates finite-aggregation variability from a structural dependence term. The resulting framework clarifies how resampling, feature-level randomization, and split selection govern resolution, tree variability, and dependence.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Random forests are widely used prediction procedures, yet are typically described algorithmically rather than as statistical designs acting on a fixed set of covariates. We develop a finite-sample, design-based formulation of random forests in which each tree is an explicit randomized conditional regression function. This perspective yields an exact variance identity for the forest predictor that separates finite-aggregation variability from a structural dependence term that persists even under infinite aggregation. We further decompose both single-tree dispersion and inter-tree covariance using the laws of total variance and covariance, isolating two fundamental design mechanisms: reuse of training observations and alignment of data-adaptive partitions. These mechanisms induce a strict covariance floor, demonstrating that predictive variability cannot be eliminated by increasing the number of trees alone. The resulting framework clarifies how resampling, feature-level randomization, and split selection govern resolution, tree variability, and dependence, and establishes random forests as explicit finite-sample statistical designs whose behavior is determined by their underlying randomized construction.
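In the exchangeable special case, this kind of variance identity reduces to the familiar form Var(forest) = c + (v - c)/B, where v is the per-tree variance, c the pairwise inter-tree covariance, and B the number of trees: the second term vanishes under infinite aggregation while the covariance floor c remains. A minimal Monte Carlo sketch can check this (the shared-plus-idiosyncratic construction and the values of v and c are illustrative assumptions, not the paper's estimator):

```python
import random
import statistics

def forest_variance(B, v=1.0, c=0.4, reps=20000, seed=0):
    """Monte Carlo variance of an average of B exchangeable 'tree'
    predictions with Var = v and pairwise Cov = c, built as a shared
    component (variance c) plus independent noise (variance v - c)."""
    rng = random.Random(seed)
    sd_shared, sd_indep = c ** 0.5, (v - c) ** 0.5
    draws = []
    for _ in range(reps):
        shared = rng.gauss(0.0, sd_shared)          # dependence across trees
        trees = [shared + rng.gauss(0.0, sd_indep)  # tree-specific noise
                 for _ in range(B)]
        draws.append(sum(trees) / B)
    return statistics.pvariance(draws)

for B in (1, 10, 100):
    exact = 0.4 + (1.0 - 0.4) / B  # c + (v - c)/B
    print(B, round(forest_variance(B), 3), round(exact, 3))
```

Growing B drives the empirical variance down toward c = 0.4 but never below it, which is the covariance-floor phenomenon the abstract describes.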
Related papers
- Partition Trees: Conditional Density Estimation over General Outcome Spaces
We propose Partition Trees, a tree-based framework for conditional density estimation over general outcome spaces. Our approach models conditional distributions as piecewise-constant densities on data-adaptive partitions and learns trees by directly minimizing conditional negative log-likelihood.
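As a minimal illustration of the piecewise-constant idea: for a fixed one-dimensional partition, the density heights minimizing negative log-likelihood are simply each cell's empirical frequency divided by its width (the partition, data, and helper names here are illustrative; the paper learns data-adaptive partitions over general outcome spaces):

```python
import math

def cell_heights(xs, edges):
    """For a fixed partition given by sorted edges, return the
    piecewise-constant density heights that minimize negative
    log-likelihood: (cell count / n) / cell width."""
    n = len(xs)
    return [(sum(1 for x in xs if lo <= x < hi) / n) / (hi - lo)
            for lo, hi in zip(edges, edges[1:])]

def neg_log_lik(xs, edges, heights):
    """Negative log-likelihood of the samples under the piecewise-constant density."""
    total = 0.0
    for x in xs:
        for (lo, hi), h in zip(zip(edges, edges[1:]), heights):
            if lo <= x < hi:
                total -= math.log(h)
                break
    return total

xs = [0.05, 0.10, 0.20, 0.30, 0.70, 0.90]
edges = [0.0, 0.5, 1.0]
heights = cell_heights(xs, edges)
print(heights)  # the denser cell gets the larger height; total mass is 1
```

Any other valid heights (for instance the uniform density on [0, 1]) give a strictly larger negative log-likelihood on these samples, which is the optimality property the frequency-over-width rule encodes.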
arXiv Detail & Related papers (2026-02-03T22:12:30Z)
- Consistency of Honest Decision Trees and Random Forests
We study various types of consistency of honest decision trees and random forests in the regression setting. We establish weak and almost sure convergence of honest trees and honest forest averages to the true regression function.
arXiv Detail & Related papers (2026-01-21T13:40:36Z)
- Identification of Causal Direction under an Arbitrary Number of Latent Confounders
In real-world scenarios, observed variables may be affected by multiple latent variables simultaneously. We make use of the joint higher-order cumulant matrix of the observed variables constructed in a specific way. We show that, surprisingly, causal asymmetry between two observed variables can be directly seen from the rank deficiency properties of such higher-order cumulant matrices.
arXiv Detail & Related papers (2025-10-26T15:10:00Z)
- Clustered random forests with correlated data for optimal estimation and inference under potential covariate shift
We develop Clustered Random Forests, a random forests algorithm for clustered data, arising from independent groups that exhibit within-cluster dependence. The leaf-wise predictions of each decision tree making up a clustered random forest take the form of a weighted least squares estimator. Clustered random forests are shown for certain tree splitting criteria to be minimax rate optimal for pointwise conditional mean estimation.
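The weighted least squares form is easiest to see for the simplest leaf model, a fitted constant (the weight values below are illustrative placeholders; the paper derives cluster-dependent weights from the within-cluster correlation structure):

```python
def leaf_wls(ys, weights):
    """Weighted least squares fit of a constant within one leaf:
    argmin_c  sum_i w_i * (y_i - c)**2  =  sum(w_i * y_i) / sum(w_i)."""
    total_w = sum(weights)
    assert len(ys) == len(weights) and total_w > 0
    return sum(w * y for w, y in zip(weights, ys)) / total_w

ys = [1.0, 2.0, 3.0, 10.0]
print(leaf_wls(ys, [1.0, 1.0, 1.0, 1.0]))  # 4.0: equal weights recover the ordinary leaf mean
print(leaf_wls(ys, [1.0, 1.0, 1.0, 0.1]))  # down-weighting one observation shrinks the fit
```

With equal weights this collapses to the ordinary leaf mean of a standard regression tree, so the clustered variant can be read as replacing that mean with a dependence-aware weighting.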
arXiv Detail & Related papers (2025-03-16T20:07:23Z)
- Learning Decision Trees as Amortized Structure Inference
We propose a hybrid amortized structure inference approach to learn predictive decision tree ensembles given data. We show that our approach, DT-GFN, outperforms state-of-the-art decision tree and deep learning methods on standard classification benchmarks.
arXiv Detail & Related papers (2025-03-10T07:05:07Z)
- Structural Entropy Guided Probabilistic Coding
We propose a novel structural entropy-guided probabilistic coding model, named SEPC. We incorporate the relationship between latent variables into the optimization by proposing a structural entropy regularization loss. Experimental results across 12 natural language understanding tasks, including both classification and regression tasks, demonstrate the superior performance of SEPC.
arXiv Detail & Related papers (2024-12-12T00:37:53Z)
- Exogenous Randomness Empowering Random Forests
We develop non-asymptotic expansions of the mean squared error (MSE) for both individual trees and forests.
Our findings unveil that feature subsampling reduces both the bias and variance of random forests compared to individual trees.
Our results reveal an intriguing phenomenon: the presence of noise features can act as a "blessing" in enhancing the performance of random forests.
arXiv Detail & Related papers (2024-11-12T05:06:10Z)
- Why do Random Forests Work? Understanding Tree Ensembles as Self-Regularizing Adaptive Smoothers
We argue that the current high-level dichotomy into bias- and variance-reduction prevalent in statistics is insufficient to understand tree ensembles.
We show that forests can improve upon trees by three distinct mechanisms that are usually implicitly entangled.
arXiv Detail & Related papers (2024-02-02T15:36:43Z)
- On the Pointwise Behavior of Recursive Partitioning and Its Implications for Heterogeneous Causal Effect Estimation
Decision tree learning is increasingly being used for pointwise inference.
We show that adaptive decision trees can fail to achieve rates of convergence in the norm with non-vanishing probability.
We show that random forests can remedy the situation, turning poor performing trees into nearly optimal procedures.
arXiv Detail & Related papers (2022-11-19T21:28:30Z)
- Random Forest Weighted Local Fréchet Regression with Random Objects
We propose a novel random forest weighted local Fréchet regression paradigm. Our first method uses these weights as the local average to solve the conditional Fréchet mean. Our second method performs local linear Fréchet regression, both significantly improving existing Fréchet regression methods.
arXiv Detail & Related papers (2022-02-10T09:10:59Z)
- Which Invariance Should We Transfer? A Causal Minimax Learning Approach
We present a comprehensive minimax analysis from a causal perspective.
We propose an efficient algorithm to search for the subset with minimal worst-case risk.
The effectiveness and efficiency of our methods are demonstrated on synthetic data and the diagnosis of Alzheimer's disease.
arXiv Detail & Related papers (2021-07-05T09:07:29Z)
- Large Scale Prediction with Decision Trees
This paper shows that decision trees constructed with the Classification and Regression Trees (CART) and C4.5 methodologies are consistent for regression and classification tasks.
A key step in the analysis is the establishment of an oracle inequality, which allows for a precise characterization of the goodness-of-fit and complexity tradeoff for a mis-specified model.
arXiv Detail & Related papers (2021-04-28T16:59:03Z)
- Latent Causal Invariant Model
Current supervised learning can learn spurious correlation during the data-fitting process.
We propose a Latent Causal Invariance Model (LaCIM) which pursues causal prediction.
arXiv Detail & Related papers (2020-11-04T10:00:27Z)
- Decision-Making with Auto-Encoding Variational Bayes
We show that a posterior approximation distinct from the variational distribution should be used for making decisions.
Motivated by these theoretical results, we propose learning several approximate proposals for the best model.
In addition to toy examples, we present a full-fledged case study of single-cell RNA sequencing.
arXiv Detail & Related papers (2020-02-17T19:23:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.