Optimal Weighted Random Forests
- URL: http://arxiv.org/abs/2305.10042v1
- Date: Wed, 17 May 2023 08:36:43 GMT
- Title: Optimal Weighted Random Forests
- Authors: Xinyu Chen, Dalei Yu, Xinyu Zhang
- Abstract summary: The random forest (RF) algorithm has become a very popular prediction method for its great flexibility and promising accuracy.
We propose two optimal weighting algorithms, namely the 1 Step Optimal Weighted RF (1step-WRF$_\mathrm{opt}$) and the 2 Steps Optimal Weighted RF (2steps-WRF$_\mathrm{opt}$).
Numerical studies conducted on real-world data sets indicate that these algorithms outperform the equal-weight forest and two other weighted RFs proposed in existing literature.
- Score: 8.962539518822684
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The random forest (RF) algorithm has become a very popular prediction method
for its great flexibility and promising accuracy. In RF, it is conventional to
put equal weights on all the base learners (trees) to aggregate their
predictions. However, the predictive performances of different trees within the
forest can be very different due to the randomization of the embedded bootstrap
sampling and feature selection. In this paper, we focus on RF for regression
and propose two optimal weighting algorithms, namely the 1 Step Optimal
Weighted RF (1step-WRF$_\mathrm{opt}$) and 2 Steps Optimal Weighted RF
(2steps-WRF$_\mathrm{opt}$), that combine the base learners through the weights
determined by weight choice criteria. Under some regularity conditions, we show
that these algorithms are asymptotically optimal in the sense that the
resulting squared loss and risk are asymptotically identical to those of the
infeasible but best possible model averaging estimator. Numerical studies
conducted on real-world data sets indicate that these algorithms outperform the
equal-weight forest and two other weighted RFs proposed in existing literature
in most cases.
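As a rough sketch of the weighting idea, the Python below fits a standard scikit-learn forest and then replaces equal weights with simplex weights chosen to minimize squared loss on a held-out split. The held-out criterion, data, and solver are illustrative assumptions; the paper's weight choice criteria and their asymptotic optimality guarantees are not reproduced here.

```python
# Minimal sketch, assuming a held-out squared-loss criterion rather than
# the paper's weight choice criteria: fit a standard forest, then pick
# simplex weights for the trees by constrained least squares.
import numpy as np
from scipy.optimize import minimize
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=500, n_features=10, noise=5.0, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

rf = RandomForestRegressor(n_estimators=50, random_state=0).fit(X_tr, y_tr)
P = np.column_stack([t.predict(X_val) for t in rf.estimators_])  # per-tree preds

def loss(w):
    return np.mean((P @ w - y_val) ** 2)

n_trees = P.shape[1]
w0 = np.full(n_trees, 1.0 / n_trees)          # the equal-weight forest
res = minimize(loss, w0, method="SLSQP",
               bounds=[(0.0, 1.0)] * n_trees,
               constraints={"type": "eq", "fun": lambda w: w.sum() - 1.0})
print("equal-weight MSE:", loss(w0), " optimized MSE:", loss(res.x))
```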
Related papers
- Adaptive Split Balancing for Optimal Random Forest [8.916614661563893]
We propose a new random forest algorithm that constructs the trees using a novel adaptive split-balancing method.
Our method achieves optimality in simple, smooth scenarios while adaptively learning the tree structure from the data.
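One naive reading of "split balancing", offered purely as a hedged guess at the flavor of the method and not as the paper's algorithm, is to split each node at the sample median of a feature so the two children stay equally sized:

```python
# Hedged guess at the flavor of split balancing (not the paper's method):
# split at the sample median so both children receive equal sample counts.
import numpy as np

def balanced_split(X, feature):
    thr = np.median(X[:, feature])
    left = X[:, feature] <= thr
    return thr, left, ~left

rng = np.random.default_rng(0)
X = rng.normal(size=(101, 3))
thr, left, right = balanced_split(X, feature=1)
print(thr, left.sum(), right.sum())   # child sizes differ by at most one point
```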
arXiv Detail & Related papers (2024-02-17T09:10:40Z)
- Selection of the Most Probable Best [2.1095005405219815]
We consider an expected-value ranking and selection (R&S) problem where all k solutions' simulation outputs depend on a common parameter whose uncertainty can be modeled by a distribution.
We define the most probable best (MPB) to be the solution that has the largest probability of being optimal with respect to the distribution.
We devise a series of algorithms that replace the unknown means in the optimality conditions with their estimates and prove the algorithms' sampling ratios achieve the conditions as the simulation budget increases.
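A hedged Monte Carlo sketch of the MPB definition: draw the common parameter from its distribution, record which solution is conditionally optimal for each draw, and report the solution that wins most often. The mean function below is hypothetical, and the paper's budget-allocation algorithms are not shown.

```python
# Sketch of the "most probable best" (MPB): the solution with the largest
# probability of being optimal under the parameter distribution. The
# conditional mean function here is a hypothetical stand-in.
import numpy as np

rng = np.random.default_rng(0)
k = 5
def conditional_mean(i, theta):        # hypothetical mean of solution i given theta
    return -(i - theta) ** 2

thetas = rng.normal(loc=2.0, scale=1.0, size=10_000)   # parameter uncertainty
best = np.array([np.argmax([conditional_mean(i, t) for i in range(k)])
                 for t in thetas])
probs = np.bincount(best, minlength=k) / len(thetas)
print("P(optimal) per solution:", probs, "MPB:", probs.argmax())
```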
arXiv Detail & Related papers (2022-07-15T15:27:27Z)
- WildWood: a new Random Forest algorithm [0.0]
WildWood is a new ensemble algorithm for supervised learning of Random Forest (RF) type.
WW uses bootstrap out-of-bag samples to compute out-of-bag scores.
WW is fast and competitive compared with other well-established ensemble methods.
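The out-of-bag mechanism the summary refers to can be sketched as follows: each tree is fit on a bootstrap sample and scored on the examples it never saw. This is a generic illustration of OOB scoring, not WildWood's implementation.

```python
# Generic out-of-bag scoring for a bagged tree ensemble (illustration
# only, not WildWood's own aggregation).
import numpy as np
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X, y = make_regression(n_samples=300, n_features=8, noise=3.0, random_state=0)
n = len(y)

oob_mse = []
for _ in range(25):
    boot = rng.integers(0, n, size=n)            # bootstrap indices
    oob = np.setdiff1d(np.arange(n), boot)       # samples left out of the bag
    tree = DecisionTreeRegressor().fit(X[boot], y[boot])
    oob_mse.append(np.mean((tree.predict(X[oob]) - y[oob]) ** 2))
print("per-tree OOB MSE (first five):", np.round(oob_mse[:5], 2))
```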
arXiv Detail & Related papers (2021-09-16T14:36:56Z)
- High Probability Complexity Bounds for Non-Smooth Stochastic Optimization with Heavy-Tailed Noise [51.31435087414348]
It is essential to theoretically guarantee that algorithms provide a small objective residual with high probability.
Existing methods for non-smooth convex optimization have complexity bounds that depend on the confidence level.
We propose novel stepsize rules for two methods with gradient clipping.
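A minimal sketch of gradient clipping, assuming a simple 1/t stepsize rather than the paper's specific stepsize rules: the update direction is rescaled whenever its norm exceeds a threshold, which tames heavy-tailed gradient noise.

```python
# Clipped SGD step with an assumed 1/t stepsize (the paper's stepsize
# rules and high-probability analysis are not reproduced here).
import numpy as np

def clipped_sgd_step(x, grad, lr, clip_level):
    g_norm = np.linalg.norm(grad)
    if g_norm > clip_level:
        grad = grad * (clip_level / g_norm)   # keep direction, cap the norm
    return x - lr * grad

rng = np.random.default_rng(0)
x = np.array([10.0])
for t in range(1, 1001):
    grad = 2 * x + rng.standard_t(df=2)       # heavy-tailed gradient noise
    x = clipped_sgd_step(x, grad, lr=1.0 / t, clip_level=5.0)
print(x)   # approaches the minimizer of x**2 despite the noise
```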
arXiv Detail & Related papers (2021-06-10T17:54:21Z)
- Large-Scale Methods for Distributionally Robust Optimization [53.98643772533416]
We prove that our algorithms require a number of gradient evaluations independent of the training set size and number of parameters.
Experiments on MNIST and ImageNet confirm the theoretical scaling of our algorithms, which are 9--36 times more efficient than full-batch methods.
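A hedged sketch of the mini-batch flavor of such methods, using a CVaR-style objective (the average of the worst alpha-fraction of losses) as one example of a DRO objective: the estimate touches only a mini-batch, so the per-step cost does not grow with the training-set size. The paper's exact estimators and bias control are not reproduced.

```python
# Mini-batch estimate of a CVaR-style DRO objective: average the worst
# alpha-fraction of per-example losses in the batch. Illustration only.
import numpy as np

def cvar_loss(losses, alpha=0.1):
    k = max(1, int(alpha * len(losses)))
    worst = np.sort(losses)[-k:]          # worst alpha-fraction of the batch
    return worst.mean()

rng = np.random.default_rng(0)
all_losses = rng.exponential(size=1_000_000)   # stand-in per-example losses
batch = rng.choice(all_losses, size=256, replace=False)
print("mini-batch CVaR estimate:", cvar_loss(batch))
```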
arXiv Detail & Related papers (2020-10-12T17:41:44Z)
- AIN: Fast and Accurate Sequence Labeling with Approximate Inference Network [75.44925576268052]
The linear-chain Conditional Random Field (CRF) model is one of the most widely-used neural sequence labeling approaches.
Exact probabilistic inference algorithms are typically applied in training and prediction stages of the CRF model.
We propose to employ a parallelizable approximate variational inference algorithm for the CRF model.
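A hedged sketch of one parallelizable variational scheme for a linear-chain CRF, mean-field updates in which every position's label marginal is refreshed simultaneously from its neighbors' current marginals; the specific update used in AIN and its network integration may differ.

```python
# Mean-field updates for a linear-chain CRF: all positions are updated
# in parallel from neighbor marginals, instead of exact sequential
# forward-backward inference. Illustration of the general idea only.
import numpy as np

def softmax(s):
    e = np.exp(s - s.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def mean_field_crf(unary, trans, n_iters=10):
    """unary: (L, K) label scores per position; trans: (K, K) transition scores."""
    q = softmax(unary)                 # independent init from unary scores
    for _ in range(n_iters):
        msg = np.zeros_like(unary)
        msg[1:] += q[:-1] @ trans      # expected score from the left neighbor
        msg[:-1] += q[1:] @ trans.T    # expected score from the right neighbor
        q = softmax(unary + msg)       # refresh all positions in parallel
    return q

rng = np.random.default_rng(0)
q = mean_field_crf(rng.normal(size=(6, 4)), rng.normal(size=(4, 4)))
print(q.argmax(axis=1))               # approximate best labeling
```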
arXiv Detail & Related papers (2020-09-17T12:18:43Z)
- Robust, Accurate Stochastic Optimization for Variational Inference [68.83746081733464]
We show that common optimization methods lead to poor variational approximations if the problem is moderately large.
Motivated by these findings, we develop a more robust and accurate optimization framework by viewing the underlying algorithm as producing a Markov chain.
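In the spirit of treating optimizer iterates as a Markov chain, one simple stabilization is to average the post-warmup iterates instead of trusting the last one; the sketch below is an illustration on a noisy quadratic, not the paper's framework.

```python
# Tail-averaging of SGD iterates on a noisy quadratic: the average over
# the (approximately stationary) post-warmup chain is typically a better
# estimate than the last iterate under noisy gradients.
import numpy as np

rng = np.random.default_rng(0)
x, lr = 5.0, 0.1
trace = []
for _ in range(2_000):
    grad = 2 * x + rng.standard_t(df=3)   # noisy gradient of x**2
    x -= lr * grad
    trace.append(x)
warmup = 1_000
print("last iterate:", trace[-1], "tail average:", np.mean(trace[warmup:]))
```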
arXiv Detail & Related papers (2020-09-01T19:12:11Z)
- Stochastic Optimization Forests [60.523606291705214]
We show how to train forest decision policies by growing trees that choose splits to directly optimize the downstream decision quality, rather than splitting to improve prediction accuracy as in the standard random forest algorithm.
We show that our approximate splitting criteria can reduce running time hundredfold, while achieving performance close to forest algorithms that exactly re-optimize for every candidate split.
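A hedged toy version of a decision-aware split criterion, using a newsvendor-style cost in place of the paper's setting: a candidate split is scored by the cost of the best single decision taken in each child, rather than by prediction MSE.

```python
# Toy decision-aware splitting: score each candidate split by downstream
# decision cost (newsvendor order quantity per child), not by MSE.
import numpy as np

def decision_cost(y, z, under=4.0, over=1.0):
    # expected cost of ordering z when demand is y
    return np.mean(under * np.maximum(y - z, 0) + over * np.maximum(z - y, 0))

def child_cost(y, under=4.0, over=1.0):
    # the optimal single decision in a leaf is the critical-fractile quantile
    z = np.quantile(y, under / (under + over))
    return len(y) * decision_cost(y, z, under, over)

def split_score(x, y, threshold):
    left = x <= threshold
    return child_cost(y[left]) + child_cost(y[~left])

rng = np.random.default_rng(0)
x = rng.uniform(size=500)
y = np.where(x < 0.5, rng.poisson(5, 500), rng.poisson(15, 500))
cands = np.linspace(0.1, 0.9, 9)
best = min(cands, key=lambda t: split_score(x, y, t))
print("best decision-aware split:", best)
```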
arXiv Detail & Related papers (2020-08-17T16:56:06Z)
- An Index-based Deterministic Asymptotically Optimal Algorithm for Constrained Multi-armed Bandit Problems [0.0]
For the model of the constrained multi-armed bandit, we show that there exists an index-based deterministic asymptotically optimal algorithm.
We provide a finite-time bound on the probability of optimality, given as $1 - O(|A|Te^{-T})$, where $T$ is the horizon size and $A$ is the set of arms in the bandit.
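For illustration, the sketch below runs a plain UCB1-style index policy, which is deterministic given the observed history; the paper's index additionally handles constraints and carries the bound above.

```python
# Deterministic index policy (plain UCB1, shown for illustration): each
# round plays the arm with the largest empirical-mean-plus-bonus index.
import numpy as np

rng = np.random.default_rng(0)
means = [0.3, 0.5, 0.7]
counts = np.zeros(3)
sums = np.zeros(3)
for t in range(1, 5_000):
    if t <= 3:
        a = t - 1                          # play each arm once
    else:
        idx = sums / counts + np.sqrt(2 * np.log(t) / counts)
        a = int(np.argmax(idx))            # deterministic given the history
    sums[a] += rng.binomial(1, means[a])
    counts[a] += 1
print("pull counts:", counts)              # concentrates on the best arm
```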
arXiv Detail & Related papers (2020-07-29T01:54:22Z)
- Stochastic batch size for adaptive regularization in deep network optimization [63.68104397173262]
We propose a first-order optimization algorithm incorporating adaptive regularization, applicable to machine learning problems in the deep learning framework.
We empirically demonstrate the effectiveness of our algorithm using an image classification task based on conventional network models applied to commonly used benchmark datasets.
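A hedged sketch of the batch-size randomization itself, with the paper's link to adaptive regularization and its full algorithm not reproduced: the number of examples used per step is drawn at random each iteration.

```python
# SGD on least squares with a batch size sampled anew at every step
# (illustration of the randomization only, not the paper's algorithm).
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 20))
w_true = rng.normal(size=20)
y = X @ w_true + 0.1 * rng.normal(size=10_000)

w, lr = np.zeros(20), 0.01
for _ in range(500):
    b = rng.integers(16, 257)                  # batch size sampled each step
    idx = rng.integers(0, len(y), size=b)
    grad = X[idx].T @ (X[idx] @ w - y[idx]) / b
    w -= lr * grad
print("parameter error:", np.linalg.norm(w - w_true))
```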
arXiv Detail & Related papers (2020-04-14T07:54:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.