WildWood: a new Random Forest algorithm
- URL: http://arxiv.org/abs/2109.08010v2
- Date: Tue, 13 Jun 2023 09:57:23 GMT
- Title: WildWood: a new Random Forest algorithm
- Authors: Stéphane Gaïffas, Ibrahim Merad, and Yiyang Yu
- Abstract summary: WildWood is a new ensemble algorithm for supervised learning of Random Forest (RF) type.
While standard RF uses bootstrap out-of-bag samples only to compute out-of-bag scores, WW uses them to aggregate the predictions of all subtrees of each fully grown tree.
WW is fast and competitive compared with other well-established ensemble methods.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce WildWood (WW), a new ensemble algorithm for supervised learning
of Random Forest (RF) type. While standard RF algorithms use bootstrap
out-of-bag samples to compute out-of-bag scores, WW uses these samples to
produce improved predictions given by an aggregation of the predictions of all
possible subtrees of each fully grown tree in the forest. This is achieved by
aggregation with exponential weights over out-of-bag samples; these weights are
computed exactly and very efficiently thanks to an algorithm called context
tree weighting. This improvement, combined with a histogram strategy to
accelerate split finding, makes WW fast and competitive compared with other
well-established ensemble methods, such as standard RF and extreme gradient
boosting algorithms.
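To make the aggregation step concrete, here is a minimal numpy sketch of prediction aggregation with exponential weights computed from out-of-bag losses. It enumerates a small candidate set explicitly, whereas WW aggregates over all subtrees of each fully grown tree, exactly and efficiently, via context tree weighting; all names and values below are illustrative.

```python
import numpy as np

def exp_weight_aggregate(preds, oob_losses, eta=1.0):
    """Aggregate candidate predictions with exponential weights.

    preds      : (n_candidates, n_samples) predictions of each candidate,
                 e.g. each pruned subtree of a fully grown tree.
    oob_losses : (n_candidates,) cumulative out-of-bag loss per candidate.
    eta        : temperature of the exponential weights.
    """
    # Low out-of-bag loss -> high weight (stabilized softmax).
    w = np.exp(-eta * (oob_losses - oob_losses.min()))
    w /= w.sum()
    return w @ preds

# Toy usage: three candidate predictors scored on five points.
preds = np.array([[0.9, 0.1, 0.8, 0.2, 0.7],
                  [0.6, 0.4, 0.5, 0.5, 0.6],
                  [0.2, 0.8, 0.1, 0.9, 0.3]])
oob_losses = np.array([1.2, 2.0, 3.5])
print(exp_weight_aggregate(preds, oob_losses))
```

The softmax-style weighting concentrates on candidates with low out-of-bag loss while never fully discarding the others, which is what lets the aggregate compete with the best single pruning.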
Related papers
- iBRF: Improved Balanced Random Forest Classifier [0.0]
Class imbalance poses a major challenge in different classification tasks.
We propose a modification to the Balanced Random Forest (BRF) classifier to enhance the prediction performance.
Our proposed hybrid sampling technique, when incorporated into the framework of the Random Forest classifier, achieves better prediction performance than other sampling techniques used in imbalanced classification tasks.
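The summary does not spell out the hybrid sampling scheme, so as a reference point, here is a minimal sketch of the balanced bootstrap used by the BRF baseline that iBRF modifies: each tree trains on a bootstrap with equal class counts. The helper name is hypothetical.

```python
import numpy as np

def balanced_bootstrap(X, y, rng):
    """Draw one balanced bootstrap: sample an equal number of points
    from each class (with replacement), as in the BRF baseline."""
    classes, counts = np.unique(y, return_counts=True)
    n_min = counts.min()
    idx = np.concatenate([
        rng.choice(np.flatnonzero(y == c), size=n_min, replace=True)
        for c in classes
    ])
    return X[idx], y[idx]

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = (rng.random(100) < 0.1).astype(int)  # ~10% positives: imbalanced
Xb, yb = balanced_bootstrap(X, y, rng)
print(np.bincount(yb))  # equal class counts in the bootstrap
```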
arXiv Detail & Related papers (2024-03-14T20:59:36Z) - Optimal Weighted Random Forests [8.962539518822684]
The random forest (RF) algorithm has become a very popular prediction method for its great flexibility and promising accuracy.
We propose two optimal algorithms, namely the 1 Step Weighted RF (1step-WRF$_{\mathrm{opt}}$) and the 2 Steps Optimal Weighted RF (2steps-WRF$_{\mathrm{opt}}$).
Numerical studies conducted on real-world data sets indicate that these algorithms outperform the equal-weight forest and two other weighted RFs proposed in existing literature.
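The exact 1step/2steps criteria are not given in the summary; as a rough sketch of the idea, the snippet below fits nonnegative, normalized tree weights by constrained least squares on held-out predictions (the `fit_tree_weights` helper and the synthetic data are illustrative).

```python
import numpy as np
from scipy.optimize import nnls

def fit_tree_weights(tree_preds, y):
    """Solve for nonnegative tree weights minimizing squared error on
    held-out predictions, then normalize them to sum to one.
    tree_preds: (n_samples, n_trees) held-out predictions per tree."""
    w, _ = nnls(tree_preds, y)
    if w.sum() > 0:
        w /= w.sum()
    return w

rng = np.random.default_rng(0)
y = rng.normal(size=200)
# Hypothetical held-out predictions of 5 trees: noisy copies of y.
tree_preds = y[:, None] + rng.normal(scale=[0.1, 0.5, 1.0, 2.0, 4.0],
                                     size=(200, 5))
w = fit_tree_weights(tree_preds, y)
print(np.round(w, 3))  # the more accurate trees get the larger weights
```

Trees with lower held-out error receive larger weights, in contrast to the equal-weight average of a standard forest.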
arXiv Detail & Related papers (2023-05-17T08:36:43Z) - MABSplit: Faster Forest Training Using Multi-Armed Bandits [14.650858573237977]
We present an algorithm that accelerates the training of random forests and other tree-based learning methods.
At the core of our algorithm is a novel node-splitting subroutine, dubbed MABSplit, used to efficiently find split points.
Our algorithm borrows techniques from the multi-armed bandit literature to judiciously determine how to allocate samples.
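As a hedged illustration of the bandit idea (not the paper's exact MABSplit subroutine), the sketch below treats candidate thresholds as arms and uses successive elimination with Hoeffding-style confidence bounds, so that clearly bad splits stop receiving samples early; all names and constants are illustrative.

```python
import numpy as np

def mab_split(x, y, thresholds, batch=64, delta=0.01, rng=None):
    """Pick a split threshold by successive elimination: estimate each
    candidate's post-split variance on growing random batches and drop
    candidates whose confidence interval is dominated."""
    rng = rng or np.random.default_rng()
    alive = list(range(len(thresholds)))
    means = np.zeros(len(thresholds))
    pulls = np.zeros(len(thresholds))
    scale = y.var() + 1e-12            # crude range proxy for the bound
    while len(alive) > 1 and pulls.max() * batch < len(y):
        idx = rng.choice(len(y), size=batch, replace=False)
        xs, ys = x[idx], y[idx]
        for a in alive:
            m = xs <= thresholds[a]
            # weighted within-node variance = impurity of this split
            cost = (m.mean() * ys[m].var() + (~m).mean() * ys[~m].var()
                    if 0 < m.sum() < batch else scale)
            means[a] = (means[a] * pulls[a] + cost) / (pulls[a] + 1)
            pulls[a] += 1
        rad = scale * np.sqrt(np.log(2 / delta) / (2 * pulls[alive]))
        best_ucb = (means[alive] + rad).min()
        alive = [a for a, lo in zip(alive, means[alive] - rad)
                 if lo <= best_ucb]
    return thresholds[min(alive, key=lambda a: means[a])]

rng = np.random.default_rng(0)
x = rng.uniform(size=5000)
y = (x > 0.6) + rng.normal(scale=0.1, size=5000)
print(mab_split(x, y, thresholds=np.linspace(0.1, 0.9, 9), rng=rng))
```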
arXiv Detail & Related papers (2022-12-14T19:43:43Z) - Layer Ensembles [95.42181254494287]
We introduce a method for uncertainty estimation that considers a set of independent categorical distributions for each layer of the network.
We show that the method can be further improved by ranking samples, resulting in models that require less memory and time to run.
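A toy numpy version of the sampling idea, assuming a two-layer network with K independent weight copies per layer; one stochastic forward pass samples a copy for each layer independently, and the spread of repeated passes serves as the uncertainty estimate (the ranking trick mentioned above is omitted).

```python
import numpy as np

rng = np.random.default_rng(0)
K, d = 4, 8                      # K independent copies of each layer
W1 = rng.normal(size=(K, d, d))  # layer 1: K candidate weight matrices
W2 = rng.normal(size=(K, d, 1))  # layer 2: K candidate weight matrices

def forward(x, rng):
    """One stochastic pass: sample an independent copy per layer."""
    h = np.tanh(x @ W1[rng.integers(K)])
    return h @ W2[rng.integers(K)]

x = rng.normal(size=(1, d))
samples = np.array([forward(x, rng) for _ in range(100)])
print(samples.mean(), samples.std())  # predictive mean and uncertainty
```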
arXiv Detail & Related papers (2022-10-10T17:52:47Z) - Forecasting the Short-Term Energy Consumption Using Random Forests and Gradient Boosting [0.0]
This paper comparatively analyzes the performance of Random Forests and Gradient Boosting algorithms in forecasting energy consumption from historical data.
The two algorithms are first applied individually to forecast the energy consumption, and then combined using a Weighted Average Ensemble Method.
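The weighting scheme is not detailed in the summary; a simple version, sketched below with scikit-learn, picks the mixing weight on a validation grid (dataset and hyperparameters are illustrative).

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

X, y = make_regression(n_samples=600, n_features=8, noise=5.0, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

rf = RandomForestRegressor(random_state=0).fit(X_tr, y_tr)
gb = GradientBoostingRegressor(random_state=0).fit(X_tr, y_tr)
p_rf, p_gb = rf.predict(X_val), gb.predict(X_val)

# Pick the mixing weight on a grid by validation error.
ws = np.linspace(0, 1, 21)
errs = [mean_squared_error(y_val, w * p_rf + (1 - w) * p_gb) for w in ws]
best_w = ws[int(np.argmin(errs))]
print(f"best weight on RF: {best_w:.2f}, val MSE: {min(errs):.2f}")
```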
arXiv Detail & Related papers (2022-07-25T07:40:25Z) - Langevin Monte Carlo for Contextual Bandits [72.00524614312002]
Langevin Monte Carlo Thompson Sampling (LMC-TS) is proposed to directly sample from the posterior distribution in contextual bandits.
We prove that the proposed algorithm achieves the same sublinear regret bound as the best Thompson sampling algorithms for a special case of contextual bandits.
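A minimal sketch of the idea for a linear-Gaussian bandit: instead of sampling the posterior in closed form, each round runs a few Langevin steps on the negative log posterior (a ridge loss here) and acts greedily on the resulting sample. Step sizes, priors, and the environment below are illustrative, not the paper's exact setup.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_arms, T = 5, 10, 500
theta_star = rng.normal(size=d)          # unknown true parameter

theta = np.zeros(d)
X_hist, r_hist = [], []
step, lam, n_mcmc = 1e-3, 1.0, 20

for t in range(T):
    arms = rng.normal(size=(n_arms, d))  # contexts for this round
    X = np.array(X_hist) if X_hist else np.zeros((0, d))
    r = np.array(r_hist) if r_hist else np.zeros(0)
    # Langevin steps on the negative log posterior (ridge regression loss)
    for _ in range(n_mcmc):
        grad = X.T @ (X @ theta - r) + lam * theta
        theta = theta - step * grad + np.sqrt(2 * step) * rng.normal(size=d)
    a = int(np.argmax(arms @ theta))     # act on the (approximate) sample
    reward = arms[a] @ theta_star + 0.1 * rng.normal()
    X_hist.append(arms[a]); r_hist.append(reward)

print("mean reward over the last 100 rounds:", np.mean(r_hist[-100:]))
```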
arXiv Detail & Related papers (2022-06-22T17:58:23Z) - Random-reshuffled SARAH does not need a full gradient computations [61.85897464405715]
The StochAstic Recursive grAdient algoRitHm (SARAH) is a variance-reduced variant of the Stochastic Gradient Descent (SGD) algorithm.
In this paper, we remove the necessity of a full gradient computation.
The aggregated gradients serve as an estimate of a full gradient in the SARAH algorithm.
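For reference, a numpy sketch of the SARAH recursion on a least-squares problem, with the classic full-gradient anchor at the start of each epoch; per the paper, that anchor can be replaced by an aggregate of past stochastic gradients. The step size and problem are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 10
A = rng.normal(size=(n, d)); b = rng.normal(size=n)

def grad_i(w, i):                       # gradient of 0.5*(a_i @ w - b_i)^2
    return A[i] * (A[i] @ w - b[i])

w = np.zeros(d)
eta = 0.02
for epoch in range(20):
    # Anchor: classic SARAH takes a full gradient here; the paper replaces
    # it with an aggregate of past stochastic gradients.
    v = A.T @ (A @ w - b) / n
    w_prev, w = w, w - eta * v
    for _ in range(n):                  # inner loop with the recursion
        i = rng.integers(n)
        # v_t = grad_i(w_t) - grad_i(w_{t-1}) + v_{t-1}
        v = grad_i(w, i) - grad_i(w_prev, i) + v
        w_prev, w = w, w - eta * v
print("final loss:", 0.5 * np.mean((A @ w - b) ** 2))
```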
arXiv Detail & Related papers (2021-11-26T06:00:44Z) - Plug-And-Play Learned Gaussian-mixture Approximate Message Passing [71.74028918819046]
We propose a plug-and-play compressed sensing (CS) recovery algorithm suitable for any i.i.d. source prior.
Our algorithm builds upon Borgerding's learned AMP (LAMP), yet significantly improves it by adopting a universal denoising function within the algorithm.
Numerical evaluation shows that the L-GM-AMP algorithm achieves state-of-the-art performance without any knowledge of the source prior.
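A hedged sketch of the AMP skeleton the method builds on, using a generic soft-threshold denoiser in place of the learned Gaussian-mixture denoiser; the Onsager correction term is what distinguishes AMP from plain iterative thresholding. All sizes and constants are illustrative.

```python
import numpy as np

def soft(u, t):                       # soft-threshold denoiser
    return np.sign(u) * np.maximum(np.abs(u) - t, 0.0)

def amp(y, A, n_iter=30, theta=1.5):
    """Generic AMP for sparse recovery with a soft-threshold denoiser;
    L-GM-AMP swaps soft() for a learned Gaussian-mixture denoiser."""
    m, n = A.shape
    x = np.zeros(n)
    z = y.copy()
    for _ in range(n_iter):
        tau = theta * np.linalg.norm(z) / np.sqrt(m)
        r = x + A.T @ z               # pseudo-data: signal + ~Gaussian noise
        x_new = soft(r, tau)
        # Onsager correction keeps the effective noise Gaussian
        onsager = (z / m) * np.count_nonzero(x_new)
        z = y - A @ x_new + onsager
        x = x_new
    return x

rng = np.random.default_rng(0)
m, n, k = 120, 400, 15
A = rng.normal(size=(m, n)) / np.sqrt(m)
x0 = np.zeros(n); x0[rng.choice(n, k, replace=False)] = rng.normal(size=k)
y = A @ x0 + 0.01 * rng.normal(size=m)
x_hat = amp(y, A)
print("relative error:", np.linalg.norm(x_hat - x0) / np.linalg.norm(x0))
```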
arXiv Detail & Related papers (2020-11-18T16:40:45Z) - Efficient Computation of Expectations under Spanning Tree Distributions [67.71280539312536]
We propose unified algorithms for the important cases of first-order expectations and second-order expectations in edge-factored, non-projective spanning-tree models.
Our algorithms exploit a fundamental connection between gradients and expectations, which allows us to derive efficient algorithms.
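The gradient/expectation connection can be seen in the undirected special case, sketched below: by the Matrix-Tree theorem the partition function Z is the determinant of a reduced Laplacian, and each edge marginal equals w_e * d log Z / d w_e (the paper's algorithms handle the directed, edge-factored case in full generality).

```python
import numpy as np

def edge_marginals(n, edges, w, root=0):
    """Spanning-tree edge marginals via the Matrix-Tree theorem:
    p_e = w_e * d log Z / d w_e, where Z = det of the reduced Laplacian."""
    L = np.zeros((n, n))
    for (u, v), we in zip(edges, w):
        L[u, u] += we; L[v, v] += we
        L[u, v] -= we; L[v, u] -= we
    keep = [i for i in range(n) if i != root]
    Linv = np.linalg.inv(L[np.ix_(keep, keep)])
    probs = []
    for (u, v), we in zip(edges, w):
        b = np.zeros(n)
        b[u], b[v] = 1.0, -1.0
        b = b[keep]
        probs.append(we * b @ Linv @ b)   # = w_e * dlogZ/dw_e
    return np.array(probs)

# Triangle graph with one heavy edge: marginals sum to n - 1 = 2,
# the number of edges in any spanning tree.
p = edge_marginals(3, [(0, 1), (1, 2), (0, 2)], [1.0, 1.0, 3.0])
print(p, p.sum())
```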
arXiv Detail & Related papers (2020-08-29T14:58:26Z) - Bandit Samplers for Training Graph Neural Networks [63.17765191700203]
Several sampling algorithms with variance reduction have been proposed for accelerating the training of Graph Convolutional Networks (GCNs).
These sampling algorithms are not applicable to more general graph neural networks (GNNs), where the message aggregator contains learned weights rather than fixed weights, such as Graph Attention Networks (GAT).
arXiv Detail & Related papers (2020-06-10T12:48:37Z) - Stochastic tree ensembles for regularized nonlinear regression [0.913755431537592]
This paper develops a novel tree ensemble method for nonlinear regression, which we refer to as XBART.
By combining regularization and search strategies from Bayesian modeling with computationally efficient techniques, the new method attains state-of-the-art performance.
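One of the distinguishing ingredients is that splits are sampled rather than chosen greedily. A toy sketch, using plain variance reduction in place of XBART's Bayesian marginal-likelihood criterion and ignoring its recursive grow-from-root sampling:

```python
import numpy as np

def sample_split(x, y, thresholds, temp=1.0, rng=None):
    """Sample a split stochastically, with probability proportional to
    exp(criterion / temp), rather than taking the greedy argmax as in CART."""
    rng = rng or np.random.default_rng()
    scores = []
    for t in thresholds:
        m = x <= t
        if m.sum() in (0, len(y)):
            scores.append(-np.inf); continue
        # variance reduction of the split as a simple stand-in criterion
        scores.append(len(y) * y.var()
                      - m.sum() * y[m].var() - (~m).sum() * y[~m].var())
    s = np.array(scores) / temp
    p = np.exp(s - s[np.isfinite(s)].max())
    p[~np.isfinite(s)] = 0.0
    p /= p.sum()
    return thresholds[rng.choice(len(thresholds), p=p)]

rng = np.random.default_rng(0)
x = rng.uniform(size=1000)
y = (x > 0.4).astype(float) + rng.normal(scale=0.3, size=1000)
print(sample_split(x, y, np.linspace(0.1, 0.9, 17), temp=50.0, rng=rng))
```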
arXiv Detail & Related papers (2020-02-09T14:37:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.