Stochastic tree ensembles for regularized nonlinear regression
- URL: http://arxiv.org/abs/2002.03375v4
- Date: Thu, 3 Jun 2021 14:44:02 GMT
- Title: Stochastic tree ensembles for regularized nonlinear regression
- Authors: Jingyu He, P. Richard Hahn
- Abstract summary: This paper develops a novel tree ensemble method for nonlinear regression, which we refer to as XBART.
By combining regularization and search strategies from Bayesian modeling with computationally efficient techniques, the new method attains state-of-the-art performance.
- Score: 0.913755431537592
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper develops a novel stochastic tree ensemble method for nonlinear
regression, which we refer to as XBART, short for Accelerated Bayesian Additive
Regression Trees. By combining regularization and stochastic search strategies
from Bayesian modeling with computationally efficient techniques from recursive
partitioning approaches, the new method attains state-of-the-art performance:
in many settings it is both faster and more accurate than the widely-used
XGBoost algorithm. Via careful simulation studies, we demonstrate that our new
approach provides accurate point-wise estimates of the mean function and does
so faster than popular alternatives, such as BART, XGBoost and neural networks
(using Keras). We also prove a number of basic theoretical results about the
new algorithm, including consistency of the single tree version of the model
and stationarity of the Markov chain produced by the ensemble version.
Furthermore, we demonstrate that initializing standard Bayesian additive
regression trees Markov chain Monte Carlo (MCMC) at XBART-fitted trees
considerably improves credible interval coverage and reduces total run-time.
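The abstract's core recipe, regularized trees combined in an additive ensemble and grown by stochastic rather than greedy search over splits, can be illustrated with a toy sketch. Everything below (the depth-1 stump learner, the softmax-over-gains split sampling, the shrinkage value) is a simplified stand-in chosen for brevity, not the actual XBART algorithm:

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_stochastic_stump(x, r, temperature=1.0):
    """Fit a depth-1 tree (stump) to residuals r, choosing the split
    point stochastically in proportion to its variance-reduction gain
    rather than greedily -- a toy stand-in for stochastic split search."""
    cuts = np.quantile(x, np.linspace(0.1, 0.9, 9))
    gains = []
    for c in cuts:
        left, right = r[x <= c], r[x > c]
        # SSE reduction of the split, up to an additive constant
        gains.append(len(left) * left.mean() ** 2 + len(right) * right.mean() ** 2)
    p = np.exp((np.array(gains) - max(gains)) / temperature)
    p /= p.sum()
    c = cuts[rng.choice(len(cuts), p=p)]
    return c, r[x <= c].mean(), r[x > c].mean()

def predict_stump(stump, x):
    c, mu_left, mu_right = stump
    return np.where(x <= c, mu_left, mu_right)

def fit_ensemble(x, y, n_trees=50, shrink=0.2):
    """Backfitting-style loop: each new tree is fit to the residuals
    left by the shrunken predictions of the trees before it."""
    r = y.copy()
    trees = []
    for _ in range(n_trees):
        stump = fit_stochastic_stump(x, r)
        r -= shrink * predict_stump(stump, x)
        trees.append(stump)
    return trees

def predict(trees, x, shrink=0.2):
    return shrink * sum(predict_stump(t, x) for t in trees)

# Fit a noisy sine wave and measure in-sample error.
x = rng.uniform(-2, 2, 400)
y = np.sin(2 * x) + 0.1 * rng.normal(size=400)
trees = fit_ensemble(x, y)
rmse = np.sqrt(np.mean((predict(trees, x) - y) ** 2))
```

The `temperature` parameter controls how far the split sampling departs from greedy selection; the real method's recursive, likelihood-weighted split sampling is considerably more involved.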
Related papers
- A Stable, Fast, and Fully Automatic Learning Algorithm for Predictive
Coding Networks [65.34977803841007]
Predictive coding networks are models with roots in both Bayesian statistics and neuroscience.
We show that simply changing the temporal scheduling of the update rule for the synaptic weights leads to an algorithm that is much more efficient and stable than the original one.
arXiv Detail & Related papers (2022-11-16T00:11:04Z)
- Distributional Adaptive Soft Regression Trees [0.0]
This article proposes a new type of distributional regression tree using a multivariate soft split rule.
One great advantage of the soft split is that smooth high-dimensional functions can be estimated with only one tree.
We show by means of extensive simulation studies that the algorithm has excellent properties and outperforms various benchmark methods.
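The key idea of a soft split can be shown in a few lines: instead of a hard left/right routing decision, a sigmoid gate assigns each observation a smooth probability of going left, so the tree's prediction is differentiable in the inputs. The gate below is a generic single-split sketch (the weights, offset, and temperature are illustrative), not the paper's multivariate rule:

```python
import numpy as np

def soft_split(x, w, b, tau=1.0):
    """Soft (sigmoidal) split: probability of routing each row of x to
    the left child, instead of a hard 0/1 decision. Smaller tau makes
    the split sharper, approaching a hard split as tau -> 0."""
    return 1.0 / (1.0 + np.exp(-(x @ w + b) / tau))

def soft_tree_predict(x, w, b, mu_left, mu_right):
    """Depth-1 soft tree: a convex combination of the two leaf means."""
    g = soft_split(x, w, b)
    return g * mu_left + (1 - g) * mu_right

# Two points on opposite sides of the (hypothetical) split plane x1 + x2 = 2.
x = np.array([[0.0, 0.0], [2.0, 2.0]])
w = np.array([1.0, 1.0])
pred = soft_tree_predict(x, w, b=-2.0, mu_left=1.0, mu_right=-1.0)
```

Because the gate varies smoothly with x, even a single soft tree can represent a smooth high-dimensional function, which is the advantage the summary above points to.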
arXiv Detail & Related papers (2022-10-19T08:59:02Z)
- When to Update Your Model: Constrained Model-based Reinforcement Learning [50.74369835934703]
We propose a novel and general theoretical scheme for a non-decreasing performance guarantee of model-based RL (MBRL).
Our follow-up derived bounds reveal the relationship between model shifts and performance improvement.
A further example demonstrates that learning models from a dynamically varying number of explorations benefits the eventual returns.
arXiv Detail & Related papers (2022-10-15T17:57:43Z)
- Sparse high-dimensional linear regression with a partitioned empirical Bayes ECM algorithm [62.997667081978825]
We propose a computationally efficient and powerful Bayesian approach for sparse high-dimensional linear regression.
Minimal prior assumptions on the parameters are made by using plug-in empirical Bayes estimates.
The proposed approach is implemented in the R package probe.
arXiv Detail & Related papers (2022-09-16T19:15:50Z)
- GP-BART: a novel Bayesian additive regression trees approach using Gaussian processes [1.03590082373586]
The GP-BART model is an extension of BART which addresses the limitation by assuming GP priors for the predictions of each terminal node among all trees.
The model's effectiveness is demonstrated through applications to simulated and real-world data, surpassing the performance of traditional modeling approaches in various scenarios.
arXiv Detail & Related papers (2022-04-05T11:18:44Z)
- Generalized Bayesian Additive Regression Trees Models: Beyond Conditional Conjugacy [2.969705152497174]
In this article, we greatly expand the domain of applicability of BART to arbitrary generalized BART models.
Our algorithm requires only that the user be able to compute the likelihood and (optionally) its gradient and Fisher information.
The potential applications are very broad; we consider examples in survival analysis, structured heteroskedastic regression, and gamma shape regression.
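The interface described above, in which the user supplies only a log-likelihood plus optional derivatives, can be sketched as a small container type. The class name, field names, and the Gaussian plug-in are hypothetical illustrations of that contract, not the authors' actual API:

```python
from dataclasses import dataclass
from typing import Callable, Optional
import numpy as np

@dataclass
class GeneralizedLikelihood:
    """What a generalized-BART-style sampler would need from the user:
    the log-likelihood of the data y given fitted values f, plus
    optional gradient and Fisher information for faster proposals."""
    log_lik: Callable[[np.ndarray, np.ndarray], float]
    grad: Optional[Callable[[np.ndarray, np.ndarray], np.ndarray]] = None
    fisher_info: Optional[Callable[[np.ndarray, np.ndarray], np.ndarray]] = None

# Example plug-in: a Gaussian likelihood with unit variance.
gauss = GeneralizedLikelihood(
    log_lik=lambda y, f: float(-0.5 * np.sum((y - f) ** 2)),
    grad=lambda y, f: y - f,
)
```

Swapping in a survival, heteroskedastic, or gamma-shape log-likelihood would then change the model without changing the tree machinery, which is the breadth of application the summary claims.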
arXiv Detail & Related papers (2022-02-20T22:52:07Z)
- A cautionary tale on fitting decision trees to data from additive models: generalization lower bounds [9.546094657606178]
We study the generalization performance of decision trees with respect to different generative regression models.
This allows us to elicit their inductive bias, that is, the assumptions the algorithms make (or do not make) to generalize to new data.
We prove a sharp squared error generalization lower bound for a large class of decision tree algorithms fitted to sparse additive models.
arXiv Detail & Related papers (2021-10-18T21:22:40Z)
- Unfolding Projection-free SDP Relaxation of Binary Graph Classifier via GDPA Linearization [59.87663954467815]
Algorithm unfolding creates an interpretable and parsimonious neural network architecture by implementing each iteration of a model-based algorithm as a neural layer.
In this paper, leveraging a recent linear algebraic theorem called Gershgorin disc perfect alignment (GDPA), we unroll a projection-free algorithm for the semi-definite programming relaxation (SDR) of a binary graph classifier.
Experimental results show that our unrolled network outperformed pure model-based graph classifiers, and achieved comparable performance to pure data-driven networks but using far fewer parameters.
arXiv Detail & Related papers (2021-09-10T07:01:15Z)
- Relational Boosted Regression Trees [1.14179290793997]
Many tasks use data housed in databases to train boosted regression tree models.
We give an adaptation of the greedy approximation algorithm for training boosted regression trees.
arXiv Detail & Related papers (2021-07-25T20:29:28Z)
- Gradient Boosted Binary Histogram Ensemble for Large-scale Regression [60.16351608335641]
We propose a gradient boosting algorithm for large-scale regression problems called Gradient Boosted Binary Histogram Ensemble (GBBHE), based on binary histogram partition and ensemble learning.
In the experiments, compared with other state-of-the-art algorithms such as gradient boosted regression tree (GBRT), our GBBHE algorithm shows promising performance with less running time on large-scale datasets.
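The appeal of a histogram-based weak learner is that fitting it costs one pass over pre-binned data rather than a full split search. The sketch below is a minimal one-dimensional illustration under simplifying assumptions (midpoint halving of a known feature range, squared loss, fixed learning rate); the actual GBBHE construction differs:

```python
import numpy as np

rng = np.random.default_rng(1)

def histogram_learner(x, r, depth=4, lo=0.0, hi=1.0):
    """Weak learner: a binary histogram. The feature range [lo, hi] is
    halved `depth` times at midpoints, and each of the 2**depth bins
    predicts the mean residual of the points that fall into it."""
    n_bins = 2 ** depth
    edges = np.linspace(lo, hi, n_bins + 1)
    idx = np.clip(np.digitize(x, edges) - 1, 0, n_bins - 1)
    means = np.zeros(n_bins)
    for b in range(n_bins):
        mask = idx == b
        if mask.any():
            means[b] = r[mask].mean()
    return edges, means

def histogram_predict(model, x):
    edges, means = model
    idx = np.clip(np.digitize(x, edges) - 1, 0, len(means) - 1)
    return means[idx]

def boosted_histograms(x, y, n_rounds=30, lr=0.3):
    """Gradient boosting with squared loss: each round fits a histogram
    learner to the negative gradient, i.e. the current residuals."""
    pred = np.zeros_like(y)
    models = []
    for _ in range(n_rounds):
        m = histogram_learner(x, y - pred)
        pred += lr * histogram_predict(m, x)
        models.append(m)
    return models

# Fit a noisy sine wave on [0, 1] and measure in-sample error.
x = rng.uniform(0, 1, 1000)
y = np.sin(6 * x) + 0.1 * rng.normal(size=1000)
models = boosted_histograms(x, y)
pred = sum(0.3 * histogram_predict(m, x) for m in models)
rmse = np.sqrt(np.mean((pred - y) ** 2))
```

Because the bin edges are fixed in advance, each boosting round is linear in the sample size, which is where the running-time advantage on large datasets comes from.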
arXiv Detail & Related papers (2021-06-03T17:05:40Z)
- Improved Branch and Bound for Neural Network Verification via Lagrangian Decomposition [161.09660864941603]
We improve the scalability of Branch and Bound (BaB) algorithms for formally proving input-output properties of neural networks.
We present a novel activation-based branching strategy and a BaB framework, named Branch and Dual Network Bound (BaDNB).
BaDNB outperforms previous complete verification systems by a large margin, cutting average verification times by factors up to 50 on adversarial properties.
arXiv Detail & Related papers (2021-04-14T09:22:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.