Generalised Boosted Forests
- URL: http://arxiv.org/abs/2102.12561v1
- Date: Wed, 24 Feb 2021 21:17:31 GMT
- Title: Generalised Boosted Forests
- Authors: Indrayudh Ghosal, Giles Hooker
- Abstract summary: We start with an MLE-type estimate in the link space and then define generalised residuals from it.
We use these residuals and some corresponding weights to fit a base random forest and then repeat the same to obtain a boost random forest.
We show with simulated and real data that both random forest steps improve test-set log-likelihood, which we treat as our primary metric.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper extends recent work on boosting random forests to model
non-Gaussian responses. Given an exponential family $\mathbb{E}[Y|X] =
g^{-1}(f(X))$, our goal is to obtain an estimate for $f$. We start with an
MLE-type estimate in the link space and then define generalised residuals from
it. We use these residuals and some corresponding weights to fit a base random
forest and then repeat the same to obtain a boost random forest. We call the
sum of these three estimators a \textit{generalised boosted forest}. We show
with simulated and real data that both random forest steps improve test-set
log-likelihood, which we treat as our primary metric. We also provide a
variance estimator, which we can obtain with the same computational cost as the
original estimate itself. Empirical experiments on real-world data and
simulations demonstrate that the methods can effectively reduce bias, and that
confidence interval coverage is conservative in the bulk of the covariate
distribution.
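The three-step construction above maps directly to code. The following is a minimal illustrative sketch, not the authors' implementation: it assumes a Poisson response with log link, substitutes the standard GLM working residuals and weights for the paper's generalised residuals, and fits both forests with scikit-learn's RandomForestRegressor. The function name fit_generalised_boosted_forest is hypothetical.

```python
# Illustrative sketch of a generalised boosted forest (assumed: Poisson
# response, log link). Residuals and weights follow the standard GLM
# working-response construction; the paper's exact definitions may differ.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def fit_generalised_boosted_forest(X, y, n_trees=500, seed=0):
    # Step 1: MLE-type estimate in the link space. With no covariates, the
    # Poisson MLE of the intercept is log(mean(y)).
    f0 = np.log(np.mean(y))
    eta = np.full(len(y), f0)          # current link-space fit

    forests = []
    for step in range(2):              # base forest, then boost forest
        mu = np.exp(eta)               # inverse link: conditional mean
        residuals = (y - mu) / mu      # working residuals on the link scale
        weights = mu                   # GLM working weights for the log link
        rf = RandomForestRegressor(n_estimators=n_trees,
                                   random_state=seed + step)
        rf.fit(X, residuals, sample_weight=weights)
        eta = eta + rf.predict(X)      # update the link-space estimate
        forests.append(rf)

    def predict_link(X_new):
        # The generalised boosted forest is the sum of the three estimators:
        # intercept + base forest + boost forest, all in the link space.
        return f0 + sum(rf.predict(X_new) for rf in forests)

    return predict_link

# Usage: estimates of E[Y|X] are recovered by applying the inverse link.
# predict_link = fit_generalised_boosted_forest(X_train, y_train)
# mu_hat = np.exp(predict_link(X_test))
```

Fitting each forest to weighted residuals rather than raw responses is what lets ordinary regression forests act as boosting steps in the link space; test-set log-likelihood can then be evaluated on mu_hat to check that each step helps.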
Related papers
- Relaxed Quantile Regression: Prediction Intervals for Asymmetric Noise (2024-06-05)
Quantile regression is a leading approach for obtaining prediction intervals via the empirical estimation of quantiles in the distribution of outputs.
We propose Relaxed Quantile Regression (RQR), a direct alternative to quantile-regression-based interval construction that removes this arbitrary constraint.
We demonstrate that this added flexibility results in intervals with an improvement in desirable qualities.
- Inference with Mondrian Random Forests (2023-10-15)
We give a central limit theorem for the estimates made by a Mondrian random forest in the regression setting.
We also provide a debiasing procedure for Mondrian random forests which allows them to achieve minimax-optimal estimation rates.
- Accelerating Generalized Random Forests with Fixed-Point Trees (2023-06-20)
Estimators are constructed by leveraging random forests as an adaptive kernel weighting algorithm.
We propose a new tree-growing rule for generalized random forests induced from a fixed-point iteration type of approximation.
- Revealing Unobservables by Deep Learning: Generative Element Extraction Networks (GEEN) (2022-10-04)
This paper proposes a novel method for estimating realizations of a latent variable $X^*$ in a random sample.
To the best of our knowledge, this paper is the first to provide such identification from observational data.
- Distributional Gradient Boosting Machines (2022-04-02)
Our framework is based on XGBoost and LightGBM.
We show that our framework achieves state-of-the-art forecast accuracy.
- On Variance Estimation of Random Forests (2022-02-18)
This paper develops an unbiased variance estimator based on incomplete U-statistics.
We show that our estimators enjoy lower bias and more accurate confidence interval coverage without additional computational costs.
- Unrolling Particles: Unsupervised Learning of Sampling Distributions (2021-10-06)
Particle filtering is used to compute good nonlinear estimates of complex systems.
We show in simulations that the resulting particle filter yields good estimates in a wide range of scenarios.
- Heavy-tailed Streaming Statistical Estimation (2021-08-25)
We consider the task of heavy-tailed statistical estimation given streaming $p$-dimensional samples.
We design a clipped gradient descent and provide an improved analysis under a more nuanced condition on the noise of gradients.
- Random Planted Forest: a directly interpretable tree ensemble (2020-12-29)
We introduce a novel interpretable tree-based algorithm for prediction in a regression setting.
In a simulation study we find encouraging prediction and visualisation properties of our random planted forest method.
- Showing Your Work Doesn't Always Work (2020-04-28)
"Show Your Work: Improved Reporting of Experimental Results" advocates for reporting the expected validation effectiveness of the best-tuned model.
We analytically show that their estimator is biased and uses error-prone assumptions.
We derive an unbiased alternative and bolster our claims with empirical evidence from statistical simulation.
- Censored Quantile Regression Forest (2020-01-08)
We develop a new estimating equation that adapts to censoring and leads to the quantile score whenever the data do not exhibit censoring.
The proposed procedure, named censored quantile regression forest, allows us to estimate quantiles of time-to-event without any parametric modeling assumption.