Optimal Weighted Random Forests
- URL: http://arxiv.org/abs/2305.10042v1
- Date: Wed, 17 May 2023 08:36:43 GMT
- Title: Optimal Weighted Random Forests
- Authors: Xinyu Chen, Dalei Yu, Xinyu Zhang
- Abstract summary: The random forest (RF) algorithm has become a very popular prediction method for its great flexibility and promising accuracy.
We propose two optimal weighting algorithms, namely the 1 Step Optimal Weighted RF (1step-WRF$_\mathrm{opt}$) and the 2 Steps Optimal Weighted RF (2steps-WRF$_\mathrm{opt}$).
Numerical studies conducted on real-world data sets indicate that these algorithms outperform the equal-weight forest and two other weighted RFs proposed in existing literature.
- Score: 8.962539518822684
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The random forest (RF) algorithm has become a very popular prediction method
for its great flexibility and promising accuracy. In RF, it is conventional to
put equal weights on all the base learners (trees) to aggregate their
predictions. However, the predictive performances of different trees within the
forest can be very different due to the randomization of the embedded bootstrap
sampling and feature selection. In this paper, we focus on RF for regression
and propose two optimal weighting algorithms, namely the 1 Step Optimal
Weighted RF (1step-WRF$_\mathrm{opt}$) and 2 Steps Optimal Weighted RF
(2steps-WRF$_\mathrm{opt}$), that combine the base learners through the weights
determined by weight choice criteria. Under some regularity conditions, we show
that these algorithms are asymptotically optimal in the sense that the
resulting squared loss and risk are asymptotically identical to those of the
infeasible but best possible model averaging estimator. Numerical studies
conducted on real-world data sets indicate that these algorithms outperform the
equal-weight forest and two other weighted RFs proposed in existing literature
in most cases.
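As a rough sketch of the weighting idea, the Python below fits a standard scikit-learn forest and then replaces equal weights with simplex weights chosen to minimize squared loss on a held-out split. The held-out criterion, data, and solver are illustrative assumptions; the paper's weight choice criteria and their asymptotic optimality guarantees are not reproduced here.

```python
# Minimal sketch, assuming a held-out squared-loss criterion rather than
# the paper's weight choice criteria: fit a standard forest, then pick
# simplex weights for the trees by constrained least squares.
import numpy as np
from scipy.optimize import minimize
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=500, n_features=10, noise=5.0, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

rf = RandomForestRegressor(n_estimators=50, random_state=0).fit(X_tr, y_tr)
P = np.column_stack([t.predict(X_val) for t in rf.estimators_])  # per-tree preds

def loss(w):
    return np.mean((P @ w - y_val) ** 2)

n_trees = P.shape[1]
w0 = np.full(n_trees, 1.0 / n_trees)          # the equal-weight forest
res = minimize(loss, w0, method="SLSQP",
               bounds=[(0.0, 1.0)] * n_trees,
               constraints={"type": "eq", "fun": lambda w: w.sum() - 1.0})
print("equal-weight MSE:", loss(w0), " optimized MSE:", loss(res.x))
```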
Related papers
- Adaptive Split Balancing for Optimal Random Forest [8.916614661563893]
We propose a new random forest algorithm that constructs the trees using a novel adaptive split-balancing method.
Our method achieves optimality in simple, smooth scenarios while adaptively learning the tree structure from the data.
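One naive reading of "split balancing", offered purely as a hedged guess at the flavor of the method and not as the paper's algorithm, is to split each node at the sample median of a feature so the two children stay equally sized:

```python
# Hedged guess at the flavor of split balancing (not the paper's method):
# split at the sample median so both children receive equal sample counts.
import numpy as np

def balanced_split(X, feature):
    thr = np.median(X[:, feature])
    left = X[:, feature] <= thr
    return thr, left, ~left

rng = np.random.default_rng(0)
X = rng.normal(size=(101, 3))
thr, left, right = balanced_split(X, feature=1)
print(thr, left.sum(), right.sum())   # child sizes differ by at most one point
```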
arXiv Detail & Related papers (2024-02-17T09:10:40Z)
- Selection of the Most Probable Best [2.1095005405219815]
We consider an expected-value ranking and selection (R&S) problem where all k solutions' simulation outputs depend on a common parameter whose uncertainty can be modeled by a distribution.
We define the most probable best (MPB) to be the solution that has the largest probability of being optimal with respect to the distribution.
We devise a series of algorithms that replace the unknown means in the optimality conditions with their estimates and prove the algorithms' sampling ratios achieve the conditions as the simulation budget increases.
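A hedged Monte Carlo sketch of the MPB definition: draw the common parameter from its distribution, record which solution is conditionally optimal for each draw, and report the solution that wins most often. The mean function below is hypothetical, and the paper's budget-allocation algorithms are not shown.

```python
# Sketch of the "most probable best" (MPB): the solution with the largest
# probability of being optimal under the parameter distribution. The
# conditional mean function here is a hypothetical stand-in.
import numpy as np

rng = np.random.default_rng(0)
k = 5
def conditional_mean(i, theta):        # hypothetical mean of solution i given theta
    return -(i - theta) ** 2

thetas = rng.normal(loc=2.0, scale=1.0, size=10_000)   # parameter uncertainty
best = np.array([np.argmax([conditional_mean(i, t) for i in range(k)])
                 for t in thetas])
probs = np.bincount(best, minlength=k) / len(thetas)
print("P(optimal) per solution:", probs, "MPB:", probs.argmax())
```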
arXiv Detail & Related papers (2022-07-15T15:27:27Z)
- WildWood: a new Random Forest algorithm [0.0]
WildWood is a new ensemble algorithm for supervised learning of Random Forest (RF) type.
WW uses bootstrap out-of-bag samples to compute out-of-bag scores.
WW is fast and competitive compared with other well-established ensemble methods.
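The out-of-bag mechanism the summary refers to can be sketched as follows: each tree is fit on a bootstrap sample and scored on the examples it never saw. This is a generic illustration of OOB scoring, not WildWood's implementation.

```python
# Generic out-of-bag scoring for a bagged tree ensemble (illustration
# only, not WildWood's own aggregation).
import numpy as np
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X, y = make_regression(n_samples=300, n_features=8, noise=3.0, random_state=0)
n = len(y)

oob_mse = []
for _ in range(25):
    boot = rng.integers(0, n, size=n)            # bootstrap indices
    oob = np.setdiff1d(np.arange(n), boot)       # samples left out of the bag
    tree = DecisionTreeRegressor().fit(X[boot], y[boot])
    oob_mse.append(np.mean((tree.predict(X[oob]) - y[oob]) ** 2))
print("per-tree OOB MSE (first five):", np.round(oob_mse[:5], 2))
```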
arXiv Detail & Related papers (2021-09-16T14:36:56Z)
- High Probability Complexity Bounds for Non-Smooth Stochastic Optimization with Heavy-Tailed Noise [51.31435087414348]
It is essential to theoretically guarantee that algorithms provide a small objective residual with high probability.
Existing methods for non-smooth convex optimization have complexity bounds that depend on the confidence level.
We propose novel stepsize rules for two methods with gradient clipping.
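A minimal sketch of gradient clipping, assuming a simple 1/t stepsize rather than the paper's specific stepsize rules: the update direction is rescaled whenever its norm exceeds a threshold, which tames heavy-tailed gradient noise.

```python
# Clipped SGD step with an assumed 1/t stepsize (the paper's stepsize
# rules and high-probability analysis are not reproduced here).
import numpy as np

def clipped_sgd_step(x, grad, lr, clip_level):
    g_norm = np.linalg.norm(grad)
    if g_norm > clip_level:
        grad = grad * (clip_level / g_norm)   # keep direction, cap the norm
    return x - lr * grad

rng = np.random.default_rng(0)
x = np.array([10.0])
for t in range(1, 1001):
    grad = 2 * x + rng.standard_t(df=2)       # heavy-tailed gradient noise
    x = clipped_sgd_step(x, grad, lr=1.0 / t, clip_level=5.0)
print(x)   # approaches the minimizer of x**2 despite the noise
```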
arXiv Detail & Related papers (2021-06-10T17:54:21Z)
- Large-Scale Methods for Distributionally Robust Optimization [53.98643772533416]
We prove that our algorithms require a number of gradient evaluations independent of the training set size and number of parameters.
Experiments on MNIST and ImageNet confirm the theoretical scaling of our algorithms, which are 9--36 times more efficient than full-batch methods.
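A hedged sketch of the mini-batch flavor of such methods, using a CVaR-style objective (the average of the worst alpha-fraction of losses) as one example of a DRO objective: the estimate touches only a mini-batch, so the per-step cost does not grow with the training-set size. The paper's exact estimators and bias control are not reproduced.

```python
# Mini-batch estimate of a CVaR-style DRO objective: average the worst
# alpha-fraction of per-example losses in the batch. Illustration only.
import numpy as np

def cvar_loss(losses, alpha=0.1):
    k = max(1, int(alpha * len(losses)))
    worst = np.sort(losses)[-k:]          # worst alpha-fraction of the batch
    return worst.mean()

rng = np.random.default_rng(0)
all_losses = rng.exponential(size=1_000_000)   # stand-in per-example losses
batch = rng.choice(all_losses, size=256, replace=False)
print("mini-batch CVaR estimate:", cvar_loss(batch))
```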
arXiv Detail & Related papers (2020-10-12T17:41:44Z)
- AIN: Fast and Accurate Sequence Labeling with Approximate Inference Network [75.44925576268052]
The linear-chain Conditional Random Field (CRF) model is one of the most widely-used neural sequence labeling approaches.
Exact probabilistic inference algorithms are typically applied in training and prediction stages of the CRF model.
We propose to employ a parallelizable approximate variational inference algorithm for the CRF model.
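A hedged sketch of one parallelizable variational scheme for a linear-chain CRF, mean-field updates in which every position's label marginal is refreshed simultaneously from its neighbors' current marginals; the specific update used in AIN and its network integration may differ.

```python
# Mean-field updates for a linear-chain CRF: all positions are updated
# in parallel from neighbor marginals, instead of exact sequential
# forward-backward inference. Illustration of the general idea only.
import numpy as np

def softmax(s):
    e = np.exp(s - s.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def mean_field_crf(unary, trans, n_iters=10):
    """unary: (L, K) label scores per position; trans: (K, K) transition scores."""
    q = softmax(unary)                 # independent init from unary scores
    for _ in range(n_iters):
        msg = np.zeros_like(unary)
        msg[1:] += q[:-1] @ trans      # expected score from the left neighbor
        msg[:-1] += q[1:] @ trans.T    # expected score from the right neighbor
        q = softmax(unary + msg)       # refresh all positions in parallel
    return q

rng = np.random.default_rng(0)
q = mean_field_crf(rng.normal(size=(6, 4)), rng.normal(size=(4, 4)))
print(q.argmax(axis=1))               # approximate best labeling
```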
arXiv Detail & Related papers (2020-09-17T12:18:43Z)
- Robust, Accurate Stochastic Optimization for Variational Inference [68.83746081733464]
We show that common optimization methods lead to poor variational approximations if the problem is moderately large.
Motivated by these findings, we develop a more robust and accurate optimization framework by viewing the underlying algorithm as producing a Markov chain.
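In the spirit of treating optimizer iterates as a Markov chain, one simple stabilization is to average the post-warmup iterates instead of trusting the last one; the sketch below is an illustration on a noisy quadratic, not the paper's framework.

```python
# Tail-averaging of SGD iterates on a noisy quadratic: the average over
# the (approximately stationary) post-warmup chain is typically a better
# estimate than the last iterate under noisy gradients.
import numpy as np

rng = np.random.default_rng(0)
x, lr = 5.0, 0.1
trace = []
for _ in range(2_000):
    grad = 2 * x + rng.standard_t(df=3)   # noisy gradient of x**2
    x -= lr * grad
    trace.append(x)
warmup = 1_000
print("last iterate:", trace[-1], "tail average:", np.mean(trace[warmup:]))
```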
arXiv Detail & Related papers (2020-09-01T19:12:11Z)
- Stochastic Optimization Forests [60.523606291705214]
We show how to train forest decision policies by growing trees that choose splits to directly optimize the downstream decision quality, rather than splitting to improve prediction accuracy as in the standard random forest algorithm.
We show that our approximate splitting criteria can reduce running time hundredfold, while achieving performance close to forest algorithms that exactly re-optimize for every candidate split.
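A hedged toy version of a decision-aware split criterion, using a newsvendor-style cost in place of the paper's setting: a candidate split is scored by the cost of the best single decision taken in each child, rather than by prediction MSE.

```python
# Toy decision-aware splitting: score each candidate split by downstream
# decision cost (newsvendor order quantity per child), not by MSE.
import numpy as np

def decision_cost(y, z, under=4.0, over=1.0):
    # expected cost of ordering z when demand is y
    return np.mean(under * np.maximum(y - z, 0) + over * np.maximum(z - y, 0))

def child_cost(y, under=4.0, over=1.0):
    # the optimal single decision in a leaf is the critical-fractile quantile
    z = np.quantile(y, under / (under + over))
    return len(y) * decision_cost(y, z, under, over)

def split_score(x, y, threshold):
    left = x <= threshold
    return child_cost(y[left]) + child_cost(y[~left])

rng = np.random.default_rng(0)
x = rng.uniform(size=500)
y = np.where(x < 0.5, rng.poisson(5, 500), rng.poisson(15, 500))
cands = np.linspace(0.1, 0.9, 9)
best = min(cands, key=lambda t: split_score(x, y, t))
print("best decision-aware split:", best)
```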
arXiv Detail & Related papers (2020-08-17T16:56:06Z)
- An Index-based Deterministic Asymptotically Optimal Algorithm for Constrained Multi-armed Bandit Problems [0.0]
For the model of the constrained multi-armed bandit, we show that there exists an index-based deterministic asymptotically optimal algorithm.
We provide a finite-time bound on the probability of optimality, given as $1 - O(|A|Te^{-T})$, where $T$ is the horizon size and $A$ is the set of arms in the bandit.
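For illustration, the sketch below runs a plain UCB1-style index policy, which is deterministic given the observed history; the paper's index additionally handles constraints and carries the bound above.

```python
# Deterministic index policy (plain UCB1, shown for illustration): each
# round plays the arm with the largest empirical-mean-plus-bonus index.
import numpy as np

rng = np.random.default_rng(0)
means = [0.3, 0.5, 0.7]
counts = np.zeros(3)
sums = np.zeros(3)
for t in range(1, 5_000):
    if t <= 3:
        a = t - 1                          # play each arm once
    else:
        idx = sums / counts + np.sqrt(2 * np.log(t) / counts)
        a = int(np.argmax(idx))            # deterministic given the history
    sums[a] += rng.binomial(1, means[a])
    counts[a] += 1
print("pull counts:", counts)              # concentrates on the best arm
```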
arXiv Detail & Related papers (2020-07-29T01:54:22Z)
- Stochastic batch size for adaptive regularization in deep network optimization [63.68104397173262]
We propose a first-order optimization algorithm incorporating adaptive regularization, applicable to machine learning problems in the deep learning framework.
We empirically demonstrate the effectiveness of our algorithm using an image classification task based on conventional network models applied to commonly used benchmark datasets.
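A hedged sketch of the batch-size randomization itself, with the paper's link to adaptive regularization and its full algorithm not reproduced: the number of examples used per step is drawn at random each iteration.

```python
# SGD on least squares with a batch size sampled anew at every step
# (illustration of the randomization only, not the paper's algorithm).
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 20))
w_true = rng.normal(size=20)
y = X @ w_true + 0.1 * rng.normal(size=10_000)

w, lr = np.zeros(20), 0.01
for _ in range(500):
    b = rng.integers(16, 257)                  # batch size sampled each step
    idx = rng.integers(0, len(y), size=b)
    grad = X[idx].T @ (X[idx] @ w - y[idx]) / b
    w -= lr * grad
print("parameter error:", np.linalg.norm(w - w_true))
```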
arXiv Detail & Related papers (2020-04-14T07:54:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.