Variational Boosted Soft Trees
- URL: http://arxiv.org/abs/2302.10706v2
- Date: Wed, 22 Feb 2023 14:03:41 GMT
- Title: Variational Boosted Soft Trees
- Authors: Tristan Cinquin, Tammo Rukat, Philipp Schmidt, Martin Wistuba and
Artur Bekasov
- Abstract summary: Gradient boosting machines (GBMs) based on decision trees consistently demonstrate state-of-the-art results on regression and classification tasks.
We propose to implement Bayesian GBMs using variational inference with soft decision trees.
- Score: 13.956254007901675
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Gradient boosting machines (GBMs) based on decision trees consistently
demonstrate state-of-the-art results on regression and classification tasks
with tabular data, often outperforming deep neural networks. However, these
models do not provide well-calibrated predictive uncertainties, which prevents
their use for decision making in high-risk applications. The Bayesian treatment
is known to improve predictive uncertainty calibration, but previously proposed
Bayesian GBM methods are either computationally expensive, or resort to crude
approximations. Variational inference is often used to implement Bayesian
neural networks, but is difficult to apply to GBMs, because the decision trees
used as weak learners are non-differentiable. In this paper, we propose to
implement Bayesian GBMs using variational inference with soft decision trees, a
fully differentiable alternative to standard decision trees introduced by Irsoy
et al. Our experiments demonstrate that variational soft trees and variational
soft GBMs provide useful uncertainty estimates, while retaining good predictive
performance. The proposed models show higher test likelihoods when compared to
the state-of-the-art Bayesian GBMs in 7/10 tabular regression datasets and
improved out-of-distribution detection in 5/10 datasets.
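To make the construction more concrete, below is a minimal sketch (not the authors' implementation) of a fixed-depth soft decision tree with a mean-field Gaussian variational posterior over its gating and leaf parameters; the class name, the standard-normal prior, the initialisation values and the PyTorch framing are assumptions made for illustration.

```python
import torch
import torch.nn as nn


class VariationalSoftTree(nn.Module):
    """Fixed-depth soft decision tree with a mean-field Gaussian
    posterior over its gating and leaf parameters (illustrative sketch)."""

    def __init__(self, in_dim, depth=3):
        super().__init__()
        self.depth = depth
        n_internal, n_leaves = 2 ** depth - 1, 2 ** depth
        # Variational parameters: a mean and log-variance per weight.
        self.w_mu = nn.Parameter(torch.zeros(n_internal, in_dim))
        self.w_logvar = nn.Parameter(torch.full((n_internal, in_dim), -5.0))
        self.b_mu = nn.Parameter(torch.zeros(n_internal))
        self.b_logvar = nn.Parameter(torch.full((n_internal,), -5.0))
        self.leaf_mu = nn.Parameter(torch.zeros(n_leaves))
        self.leaf_logvar = nn.Parameter(torch.full((n_leaves,), -5.0))

    @staticmethod
    def _sample(mu, logvar):
        # Reparameterisation trick: mu + sigma * eps, eps ~ N(0, I).
        return mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)

    def forward(self, x):
        w = self._sample(self.w_mu, self.w_logvar)
        b = self._sample(self.b_mu, self.b_logvar)
        leaves = self._sample(self.leaf_mu, self.leaf_logvar)
        gate = torch.sigmoid(x @ w.t() + b)        # P(route right) at each node
        path = torch.ones(x.shape[0], 1, device=x.device)
        for d in range(self.depth):                # propagate path probabilities
            g = gate[:, 2 ** d - 1: 2 ** (d + 1) - 1]
            path = torch.stack([path * (1 - g), path * g], dim=2).flatten(1)
        return path @ leaves                       # soft average over leaf values

    def kl(self):
        # KL(q || N(0, I)) summed over all variational parameters.
        total = 0.0
        for mu, logvar in [(self.w_mu, self.w_logvar),
                           (self.b_mu, self.b_logvar),
                           (self.leaf_mu, self.leaf_logvar)]:
            total = total + 0.5 * torch.sum(
                torch.exp(logvar) + mu ** 2 - 1.0 - logvar)
        return total
```

Training would minimise a negative ELBO, e.g. a Gaussian negative log-likelihood on minibatches plus model.kl() scaled by 1/N; a variational soft GBM would combine several such trees as boosted weak learners.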
Related papers
- Calibrating Neural Simulation-Based Inference with Differentiable Coverage Probability [50.44439018155837]
We propose to include a calibration term directly into the training objective of the neural model.
By introducing a relaxation of the classical formulation of calibration error we enable end-to-end backpropagation.
It is directly applicable to existing computational pipelines, allowing reliable black-box posterior inference.
arXiv Detail & Related papers (2023-10-20T10:20:45Z)
- Variational Classification [51.2541371924591]
We derive a variational objective to train the model, analogous to the evidence lower bound (ELBO) used to train variational auto-encoders.
Treating inputs to the softmax layer as samples of a latent variable, our abstracted perspective reveals a potential inconsistency.
We induce a chosen latent distribution, instead of the distribution implicitly assumed by a standard softmax layer.
arXiv Detail & Related papers (2023-05-17T17:47:19Z)
- On Uncertainty Estimation by Tree-based Surrogate Models in Sequential Model-based Optimization [13.52611859628841]
We revisit various ensembles of randomized trees to investigate their behavior from the perspective of prediction uncertainty estimation.
We propose a new way of constructing an ensemble of randomized trees, referred to as BwO forest, where bagging with oversampling is employed to construct bootstrapped samples.
Experimental results demonstrate the validity and good performance of BwO forest over existing tree-based models in various circumstances.
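As a rough illustration of the general idea (not the paper's exact BwO construction), the sketch below fits each randomized tree on an oversampled bootstrap resample and uses the spread of per-tree predictions as an uncertainty estimate; the function names, the ExtraTreeRegressor base learner and the oversampling factor are assumptions.

```python
import numpy as np
from sklearn.tree import ExtraTreeRegressor


def fit_bwo_forest(X, y, n_trees=100, oversample=2.0, seed=0):
    """Fit each randomized tree on an oversampled bootstrap resample."""
    rng = np.random.default_rng(seed)
    n = len(y)
    trees = []
    for _ in range(n_trees):
        # Bootstrap with replacement, drawing more samples than the original set.
        idx = rng.integers(0, n, size=int(oversample * n))
        tree = ExtraTreeRegressor(random_state=int(rng.integers(1 << 31)))
        tree.fit(X[idx], y[idx])
        trees.append(tree)
    return trees


def predict_with_uncertainty(trees, X):
    preds = np.stack([t.predict(X) for t in trees])
    return preds.mean(axis=0), preds.std(axis=0)   # ensemble mean and spread
```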
arXiv Detail & Related papers (2022-02-22T04:50:37Z)
- Feature Importance in Gradient Boosting Trees with Cross-Validation Feature Selection [11.295032417617454]
We study the effect of biased base learners on Gradient Boosting Machines (GBM) feature importance (FI) measures.
By utilizing cross-validated (CV) unbiased base learners, we fix this flaw at a relatively low computational cost.
We demonstrate the suggested framework in a variety of synthetic and real-world setups, showing a significant improvement in all GBM FI measures while maintaining roughly the same level of prediction accuracy.
arXiv Detail & Related papers (2021-09-12T09:32:43Z)
- Probabilistic Gradient Boosting Machines for Large-Scale Probabilistic Regression [51.770998056563094]
Probabilistic Gradient Boosting Machines (PGBM) is a method to create probabilistic predictions with a single ensemble of decision trees.
We empirically demonstrate the advantages of PGBM compared to existing state-of-the-art methods.
arXiv Detail & Related papers (2021-06-03T08:32:13Z)
- Calibration and Uncertainty Quantification of Bayesian Convolutional Neural Networks for Geophysical Applications [0.0]
Since the uncertainty of predictions is commonly incorporated into decision making, such subsurface models should provide calibrated probabilities and the associated uncertainties in their predictions.
It has been shown that popular Deep Learning-based models are often miscalibrated, and due to their deterministic nature, provide no means to interpret the uncertainty of their predictions.
We compare three different approaches to obtaining probabilistic models based on convolutional neural networks in a Bayesian formalism.
arXiv Detail & Related papers (2021-05-25T17:54:23Z)
- Sampling-free Variational Inference for Neural Networks with Multiplicative Activation Noise [51.080620762639434]
We propose a more efficient parameterization of the posterior approximation for sampling-free variational inference.
Our approach yields competitive results for standard regression problems and scales well to large-scale image classification tasks.
arXiv Detail & Related papers (2021-03-15T16:16:18Z)
- Nonparametric Variable Screening with Optimal Decision Stumps [19.493449206135296]
We derive finite sample performance guarantees for variable selection in nonparametric models using a single-level CART decision tree.
Unlike previous marginal screening methods that attempt to directly estimate each marginal projection via a truncated basis expansion, the fitted model used here is a simple, parsimonious decision stump.
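As a sketch of marginal screening with decision stumps (illustrative only; it does not reproduce the paper's exact procedure or its finite-sample guarantees), one can fit a depth-1 CART tree on each feature separately and rank features by the resulting reduction in squared error; the function name and scoring details are assumptions.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor


def stump_screen(X, y, top_k=10):
    """Rank features by the variance reduction of the best single split
    (a depth-1 CART stump) fitted on each feature alone."""
    total_ss = np.sum((y - y.mean()) ** 2)
    scores = np.empty(X.shape[1])
    for j in range(X.shape[1]):
        stump = DecisionTreeRegressor(max_depth=1).fit(X[:, [j]], y)
        resid = y - stump.predict(X[:, [j]])
        scores[j] = total_ss - np.sum(resid ** 2)   # impurity decrease
    order = np.argsort(scores)[::-1]
    return order[:top_k], scores[order[:top_k]]
```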
arXiv Detail & Related papers (2020-11-05T06:56:12Z)
- Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z)
- Being Bayesian, Even Just a Bit, Fixes Overconfidence in ReLU Networks [65.24701908364383]
We show that a sufficient condition for a ReLU network to have calibrated uncertainty is "to be a bit Bayesian".
We further validate these findings empirically via various standard experiments using common deep ReLU networks and Laplace approximations.
arXiv Detail & Related papers (2020-02-24T08:52:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the generated content (including all information) and is not responsible for any consequences.