WildWood: a new Random Forest algorithm
- URL: http://arxiv.org/abs/2109.08010v2
- Date: Tue, 13 Jun 2023 09:57:23 GMT
- Title: WildWood: a new Random Forest algorithm
- Authors: Stéphane Gaïffas, Ibrahim Merad, and Yiyang Yu
- Abstract summary: WildWood is a new ensemble algorithm for supervised learning of Random Forest (RF) type.
While standard RF uses bootstrap out-of-bag samples only to compute out-of-bag scores, WW uses them to aggregate the predictions of all subtrees of each fully grown tree.
WW is fast and competitive compared with other well-established ensemble methods.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce WildWood (WW), a new ensemble algorithm for supervised learning
of Random Forest (RF) type. While standard RF algorithms use bootstrap
out-of-bag samples to compute out-of-bag scores, WW uses these samples to
produce improved predictions given by an aggregation of the predictions of all
possible subtrees of each fully grown tree in the forest. This is achieved by
aggregation with exponential weights over out-of-bag samples; these weights are
computed exactly and very efficiently thanks to an algorithm called context
tree weighting. This improvement, combined with a histogram strategy to
accelerate split finding, makes WW fast and competitive compared with other
well-established ensemble methods, such as standard RF and extreme gradient
boosting algorithms.
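To make the aggregation step concrete, here is a minimal numpy sketch of prediction aggregation with exponential weights computed from out-of-bag losses. It enumerates a small candidate set explicitly, whereas WW aggregates over all subtrees of each fully grown tree, exactly and efficiently, via context tree weighting; all names and values below are illustrative.

```python
import numpy as np

def exp_weight_aggregate(preds, oob_losses, eta=1.0):
    """Aggregate candidate predictions with exponential weights.

    preds      : (n_candidates, n_samples) predictions of each candidate,
                 e.g. each pruned subtree of a fully grown tree.
    oob_losses : (n_candidates,) cumulative out-of-bag loss per candidate.
    eta        : temperature of the exponential weights.
    """
    # Low out-of-bag loss -> high weight (stabilized softmax).
    w = np.exp(-eta * (oob_losses - oob_losses.min()))
    w /= w.sum()
    return w @ preds

# Toy usage: three candidate predictors scored on five points.
preds = np.array([[0.9, 0.1, 0.8, 0.2, 0.7],
                  [0.6, 0.4, 0.5, 0.5, 0.6],
                  [0.2, 0.8, 0.1, 0.9, 0.3]])
oob_losses = np.array([1.2, 2.0, 3.5])
print(exp_weight_aggregate(preds, oob_losses))
```

The softmax-style weighting concentrates on candidates with low out-of-bag loss while never fully discarding the others, which is what lets the aggregate compete with the best single pruning.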
Related papers
- iBRF: Improved Balanced Random Forest Classifier [0.0]
Class imbalance poses a major challenge in different classification tasks.
We propose a modification to the Balanced Random Forest (BRF) classifier to enhance the prediction performance.
Our proposed hybrid sampling technique, when incorporated into the framework of the Random Forest classifier, achieves better prediction performance than other sampling techniques used in imbalanced classification tasks.
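The summary does not spell out the hybrid sampling scheme, so as a reference point, here is a minimal sketch of the balanced bootstrap used by the BRF baseline that iBRF modifies: each tree trains on a bootstrap with equal class counts. The helper name is hypothetical.

```python
import numpy as np

def balanced_bootstrap(X, y, rng):
    """Draw one balanced bootstrap: sample an equal number of points
    from each class (with replacement), as in the BRF baseline."""
    classes, counts = np.unique(y, return_counts=True)
    n_min = counts.min()
    idx = np.concatenate([
        rng.choice(np.flatnonzero(y == c), size=n_min, replace=True)
        for c in classes
    ])
    return X[idx], y[idx]

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = (rng.random(100) < 0.1).astype(int)  # ~10% positives: imbalanced
Xb, yb = balanced_bootstrap(X, y, rng)
print(np.bincount(yb))  # equal class counts in the bootstrap
```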
arXiv Detail & Related papers (2024-03-14T20:59:36Z) - Optimal Weighted Random Forests [8.962539518822684]
The random forest (RF) algorithm has become a very popular prediction method for its great flexibility and promising accuracy.
We propose two optimal algorithms, namely the 1 Step Weighted RF (1step-WRF$_{\mathrm{opt}}$) and the 2 Steps Optimal Weighted RF (2steps-WRF$_{\mathrm{opt}}$).
Numerical studies conducted on real-world data sets indicate that these algorithms outperform the equal-weight forest and two other weighted RFs proposed in existing literature.
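The exact 1step/2steps criteria are not given in the summary; as a rough sketch of the idea, the snippet below fits nonnegative, normalized tree weights by constrained least squares on held-out predictions (the `fit_tree_weights` helper and the synthetic data are illustrative).

```python
import numpy as np
from scipy.optimize import nnls

def fit_tree_weights(tree_preds, y):
    """Solve for nonnegative tree weights minimizing squared error on
    held-out predictions, then normalize them to sum to one.
    tree_preds: (n_samples, n_trees) held-out predictions per tree."""
    w, _ = nnls(tree_preds, y)
    if w.sum() > 0:
        w /= w.sum()
    return w

rng = np.random.default_rng(0)
y = rng.normal(size=200)
# Hypothetical held-out predictions of 5 trees: noisy copies of y.
tree_preds = y[:, None] + rng.normal(scale=[0.1, 0.5, 1.0, 2.0, 4.0],
                                     size=(200, 5))
w = fit_tree_weights(tree_preds, y)
print(np.round(w, 3))  # the more accurate trees get the larger weights
```

Trees with lower held-out error receive larger weights, in contrast to the equal-weight average of a standard forest.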
arXiv Detail & Related papers (2023-05-17T08:36:43Z) - MABSplit: Faster Forest Training Using Multi-Armed Bandits [14.650858573237977]
We present an algorithm that accelerates the training of random forests and other tree-based learning methods.
At the core of our algorithm is a novel node-splitting subroutine, dubbed MABSplit, used to efficiently find split points.
Our algorithm borrows techniques from the multi-armed bandit literature to judiciously determine how to allocate samples.
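As a hedged illustration of the bandit idea (not the paper's exact MABSplit subroutine), the sketch below treats candidate thresholds as arms and uses successive elimination with Hoeffding-style confidence bounds, so that clearly bad splits stop receiving samples early; all names and constants are illustrative.

```python
import numpy as np

def mab_split(x, y, thresholds, batch=64, delta=0.01, rng=None):
    """Pick a split threshold by successive elimination: estimate each
    candidate's post-split variance on growing random batches and drop
    candidates whose confidence interval is dominated."""
    rng = rng or np.random.default_rng()
    alive = list(range(len(thresholds)))
    means = np.zeros(len(thresholds))
    pulls = np.zeros(len(thresholds))
    scale = y.var() + 1e-12            # crude range proxy for the bound
    while len(alive) > 1 and pulls.max() * batch < len(y):
        idx = rng.choice(len(y), size=batch, replace=False)
        xs, ys = x[idx], y[idx]
        for a in alive:
            m = xs <= thresholds[a]
            # weighted within-node variance = impurity of this split
            cost = (m.mean() * ys[m].var() + (~m).mean() * ys[~m].var()
                    if 0 < m.sum() < batch else scale)
            means[a] = (means[a] * pulls[a] + cost) / (pulls[a] + 1)
            pulls[a] += 1
        rad = scale * np.sqrt(np.log(2 / delta) / (2 * pulls[alive]))
        best_ucb = (means[alive] + rad).min()
        alive = [a for a, lo in zip(alive, means[alive] - rad)
                 if lo <= best_ucb]
    return thresholds[min(alive, key=lambda a: means[a])]

rng = np.random.default_rng(0)
x = rng.uniform(size=5000)
y = (x > 0.6) + rng.normal(scale=0.1, size=5000)
print(mab_split(x, y, thresholds=np.linspace(0.1, 0.9, 9), rng=rng))
```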
arXiv Detail & Related papers (2022-12-14T19:43:43Z) - Layer Ensembles [95.42181254494287]
We introduce a method for uncertainty estimation that considers a set of independent categorical distributions for each layer of the network.
We show that the method can be further improved by ranking samples, resulting in models that require less memory and time to run.
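A toy numpy version of the sampling idea, assuming a two-layer network with K independent weight copies per layer; one stochastic forward pass samples a copy for each layer independently, and the spread of repeated passes serves as the uncertainty estimate (the ranking trick mentioned above is omitted).

```python
import numpy as np

rng = np.random.default_rng(0)
K, d = 4, 8                      # K independent copies of each layer
W1 = rng.normal(size=(K, d, d))  # layer 1: K candidate weight matrices
W2 = rng.normal(size=(K, d, 1))  # layer 2: K candidate weight matrices

def forward(x, rng):
    """One stochastic pass: sample an independent copy per layer."""
    h = np.tanh(x @ W1[rng.integers(K)])
    return h @ W2[rng.integers(K)]

x = rng.normal(size=(1, d))
samples = np.array([forward(x, rng) for _ in range(100)])
print(samples.mean(), samples.std())  # predictive mean and uncertainty
```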
arXiv Detail & Related papers (2022-10-10T17:52:47Z) - Forecasting the Short-Term Energy Consumption Using Random Forests and Gradient Boosting [0.0]
This paper comparatively analyzes the performance of Random Forests and Gradient Boosting algorithms in forecasting energy consumption from historical data.
The two algorithms are first applied individually to forecast the energy consumption, and then combined using a Weighted Average Ensemble Method.
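The weighting scheme is not detailed in the summary; a simple version, sketched below with scikit-learn, picks the mixing weight on a validation grid (dataset and hyperparameters are illustrative).

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

X, y = make_regression(n_samples=600, n_features=8, noise=5.0, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

rf = RandomForestRegressor(random_state=0).fit(X_tr, y_tr)
gb = GradientBoostingRegressor(random_state=0).fit(X_tr, y_tr)
p_rf, p_gb = rf.predict(X_val), gb.predict(X_val)

# Pick the mixing weight on a grid by validation error.
ws = np.linspace(0, 1, 21)
errs = [mean_squared_error(y_val, w * p_rf + (1 - w) * p_gb) for w in ws]
best_w = ws[int(np.argmin(errs))]
print(f"best weight on RF: {best_w:.2f}, val MSE: {min(errs):.2f}")
```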
arXiv Detail & Related papers (2022-07-25T07:40:25Z) - Langevin Monte Carlo for Contextual Bandits [72.00524614312002]
Langevin Monte Carlo Thompson Sampling (LMC-TS) is proposed to directly sample from the posterior distribution in contextual bandits.
We prove that the proposed algorithm achieves the same sublinear regret bound as the best Thompson sampling algorithms for a special case of contextual bandits.
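A minimal sketch of the idea for a linear-Gaussian bandit: instead of sampling the posterior in closed form, each round runs a few Langevin steps on the negative log posterior (a ridge loss here) and acts greedily on the resulting sample. Step sizes, priors, and the environment below are illustrative, not the paper's exact setup.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_arms, T = 5, 10, 500
theta_star = rng.normal(size=d)          # unknown true parameter

theta = np.zeros(d)
X_hist, r_hist = [], []
step, lam, n_mcmc = 1e-3, 1.0, 20

for t in range(T):
    arms = rng.normal(size=(n_arms, d))  # contexts for this round
    X = np.array(X_hist) if X_hist else np.zeros((0, d))
    r = np.array(r_hist) if r_hist else np.zeros(0)
    # Langevin steps on the negative log posterior (ridge regression loss)
    for _ in range(n_mcmc):
        grad = X.T @ (X @ theta - r) + lam * theta
        theta = theta - step * grad + np.sqrt(2 * step) * rng.normal(size=d)
    a = int(np.argmax(arms @ theta))     # act on the (approximate) sample
    reward = arms[a] @ theta_star + 0.1 * rng.normal()
    X_hist.append(arms[a]); r_hist.append(reward)

print("mean reward over the last 100 rounds:", np.mean(r_hist[-100:]))
```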
arXiv Detail & Related papers (2022-06-22T17:58:23Z) - Random-reshuffled SARAH does not need a full gradient computations [61.85897464405715]
The StochAstic Recursive grAdient algoRitHm (SARAH) is a variance-reduced variant of the Stochastic Gradient Descent (SGD) algorithm.
In this paper, we remove the necessity of a full gradient computation.
The aggregated gradients serve as an estimate of a full gradient in the SARAH algorithm.
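For reference, a numpy sketch of the SARAH recursion on a least-squares problem, with the classic full-gradient anchor at the start of each epoch; per the paper, that anchor can be replaced by an aggregate of past stochastic gradients. The step size and problem are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 10
A = rng.normal(size=(n, d)); b = rng.normal(size=n)

def grad_i(w, i):                       # gradient of 0.5*(a_i @ w - b_i)^2
    return A[i] * (A[i] @ w - b[i])

w = np.zeros(d)
eta = 0.02
for epoch in range(20):
    # Anchor: classic SARAH takes a full gradient here; the paper replaces
    # it with an aggregate of past stochastic gradients.
    v = A.T @ (A @ w - b) / n
    w_prev, w = w, w - eta * v
    for _ in range(n):                  # inner loop with the recursion
        i = rng.integers(n)
        # v_t = grad_i(w_t) - grad_i(w_{t-1}) + v_{t-1}
        v = grad_i(w, i) - grad_i(w_prev, i) + v
        w_prev, w = w, w - eta * v
print("final loss:", 0.5 * np.mean((A @ w - b) ** 2))
```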
arXiv Detail & Related papers (2021-11-26T06:00:44Z) - Plug-And-Play Learned Gaussian-mixture Approximate Message Passing [71.74028918819046]
We propose a plug-and-play compressed sensing (CS) recovery algorithm suitable for any i.i.d. source prior.
Our algorithm builds upon Borgerding's learned AMP (LAMP), yet significantly improves it by adopting a universal denoising function within the algorithm.
Numerical evaluation shows that the L-GM-AMP algorithm achieves state-of-the-art performance without any knowledge of the source prior.
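A hedged sketch of the AMP skeleton the method builds on, using a generic soft-threshold denoiser in place of the learned Gaussian-mixture denoiser; the Onsager correction term is what distinguishes AMP from plain iterative thresholding. All sizes and constants are illustrative.

```python
import numpy as np

def soft(u, t):                       # soft-threshold denoiser
    return np.sign(u) * np.maximum(np.abs(u) - t, 0.0)

def amp(y, A, n_iter=30, theta=1.5):
    """Generic AMP for sparse recovery with a soft-threshold denoiser;
    L-GM-AMP swaps soft() for a learned Gaussian-mixture denoiser."""
    m, n = A.shape
    x = np.zeros(n)
    z = y.copy()
    for _ in range(n_iter):
        tau = theta * np.linalg.norm(z) / np.sqrt(m)
        r = x + A.T @ z               # pseudo-data: signal + ~Gaussian noise
        x_new = soft(r, tau)
        # Onsager correction keeps the effective noise Gaussian
        onsager = (z / m) * np.count_nonzero(x_new)
        z = y - A @ x_new + onsager
        x = x_new
    return x

rng = np.random.default_rng(0)
m, n, k = 120, 400, 15
A = rng.normal(size=(m, n)) / np.sqrt(m)
x0 = np.zeros(n); x0[rng.choice(n, k, replace=False)] = rng.normal(size=k)
y = A @ x0 + 0.01 * rng.normal(size=m)
x_hat = amp(y, A)
print("relative error:", np.linalg.norm(x_hat - x0) / np.linalg.norm(x0))
```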
arXiv Detail & Related papers (2020-11-18T16:40:45Z) - Efficient Computation of Expectations under Spanning Tree Distributions [67.71280539312536]
We propose unified algorithms for the important cases of first-order expectations and second-order expectations in edge-factored, non-projective spanning-tree models.
Our algorithms exploit a fundamental connection between gradients and expectations, which allows us to derive efficient algorithms.
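The gradient/expectation connection can be seen in the undirected special case, sketched below: by the Matrix-Tree theorem the partition function Z is the determinant of a reduced Laplacian, and each edge marginal equals w_e * d log Z / d w_e (the paper's algorithms handle the directed, edge-factored case in full generality).

```python
import numpy as np

def edge_marginals(n, edges, w, root=0):
    """Spanning-tree edge marginals via the Matrix-Tree theorem:
    p_e = w_e * d log Z / d w_e, where Z = det of the reduced Laplacian."""
    L = np.zeros((n, n))
    for (u, v), we in zip(edges, w):
        L[u, u] += we; L[v, v] += we
        L[u, v] -= we; L[v, u] -= we
    keep = [i for i in range(n) if i != root]
    Linv = np.linalg.inv(L[np.ix_(keep, keep)])
    probs = []
    for (u, v), we in zip(edges, w):
        b = np.zeros(n)
        b[u], b[v] = 1.0, -1.0
        b = b[keep]
        probs.append(we * b @ Linv @ b)   # = w_e * dlogZ/dw_e
    return np.array(probs)

# Triangle graph with one heavy edge: marginals sum to n - 1 = 2,
# the number of edges in any spanning tree.
p = edge_marginals(3, [(0, 1), (1, 2), (0, 2)], [1.0, 1.0, 3.0])
print(p, p.sum())
```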
arXiv Detail & Related papers (2020-08-29T14:58:26Z) - Bandit Samplers for Training Graph Neural Networks [63.17765191700203]
Several sampling algorithms with variance reduction have been proposed for accelerating the training of Graph Convolutional Networks (GCNs).
These sampling algorithms are not applicable to more general graph neural networks (GNNs), where the message aggregator contains learned weights rather than fixed weights, such as Graph Attention Networks (GAT).
arXiv Detail & Related papers (2020-06-10T12:48:37Z) - Stochastic tree ensembles for regularized nonlinear regression [0.913755431537592]
This paper develops a novel tree ensemble method for nonlinear regression, which we refer to as XBART.
By combining regularization and search strategies from Bayesian modeling with computationally efficient techniques, the new method attains state-of-the-art performance.
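One of the distinguishing ingredients is that splits are sampled rather than chosen greedily. A toy sketch, using plain variance reduction in place of XBART's Bayesian marginal-likelihood criterion and ignoring its recursive grow-from-root sampling:

```python
import numpy as np

def sample_split(x, y, thresholds, temp=1.0, rng=None):
    """Sample a split stochastically, with probability proportional to
    exp(criterion / temp), rather than taking the greedy argmax as in CART."""
    rng = rng or np.random.default_rng()
    scores = []
    for t in thresholds:
        m = x <= t
        if m.sum() in (0, len(y)):
            scores.append(-np.inf); continue
        # variance reduction of the split as a simple stand-in criterion
        scores.append(len(y) * y.var()
                      - m.sum() * y[m].var() - (~m).sum() * y[~m].var())
    s = np.array(scores) / temp
    p = np.exp(s - s[np.isfinite(s)].max())
    p[~np.isfinite(s)] = 0.0
    p /= p.sum()
    return thresholds[rng.choice(len(thresholds), p=p)]

rng = np.random.default_rng(0)
x = rng.uniform(size=1000)
y = (x > 0.4).astype(float) + rng.normal(scale=0.3, size=1000)
print(sample_split(x, y, np.linspace(0.1, 0.9, 17), temp=50.0, rng=rng))
```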
arXiv Detail & Related papers (2020-02-09T14:37:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.