Uncertainty in Gradient Boosting via Ensembles
- URL: http://arxiv.org/abs/2006.10562v4
- Date: Fri, 2 Apr 2021 09:00:20 GMT
- Title: Uncertainty in Gradient Boosting via Ensembles
- Authors: Andrey Malinin, Liudmila Prokhorenkova and Aleksei Ustimenko
- Abstract summary: Ensembles of gradient boosting models successfully detect anomalous inputs while having limited ability to improve the predicted total uncertainty.
We propose the concept of a virtual ensemble to get the benefits of an ensemble with only one gradient boosting model, which significantly reduces complexity.
- Score: 37.808845398471874
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: For many practical, high-risk applications, it is essential to quantify
uncertainty in a model's predictions to avoid costly mistakes. While predictive
uncertainty is widely studied for neural networks, the topic seems to be
under-explored for models based on gradient boosting. However, gradient
boosting often achieves state-of-the-art results on tabular data. This work
examines a probabilistic ensemble-based framework for deriving uncertainty
estimates in the predictions of gradient boosting classification and regression
models. We conducted experiments on a range of synthetic and real datasets and
investigated the applicability of ensemble approaches to gradient boosting
models that are themselves ensembles of decision trees. Our analysis shows that
ensembles of gradient boosting models successfully detect anomalous inputs
while having limited ability to improve the predicted total uncertainty.
Importantly, we also propose a concept of a virtual ensemble to get the
benefits of an ensemble via only one gradient boosting model, which
significantly reduces complexity.
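The virtual ensemble can be approximated with any boosting library that exposes staged predictions: truncated versions of a single model, taken at several tree counts, serve as ensemble members, and their disagreement gives an uncertainty signal for flagging anomalous inputs. Below is a minimal sketch using scikit-learn's staged_predict; the truncation points and member count are illustrative choices for this sketch, not the paper's exact recipe (the authors' method is implemented natively in CatBoost as virtual_ensembles_predict).

```python
# Minimal sketch of the virtual-ensemble idea: treat truncated versions of a
# single gradient boosting model as ensemble members and use the spread of
# their predictions as an uncertainty estimate. Truncation points below are
# illustrative choices, not the paper's exact recipe.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor

X, y = make_regression(n_samples=500, n_features=10, noise=5.0, random_state=0)
model = GradientBoostingRegressor(
    n_estimators=400, learning_rate=0.05, random_state=0
).fit(X, y)

# stages[k] is the prediction of the model truncated after k + 1 trees.
stages = list(model.staged_predict(X))
# Use only later truncations so each member is close to convergence.
members = np.stack([stages[k] for k in range(199, 400, 50)])  # 5 virtual members

mean_pred = members.mean(axis=0)    # ensemble prediction
disagreement = members.var(axis=0)  # member disagreement, high on anomalous inputs
```

Because all members share a single training run, the overhead relative to one model is negligible, which is the complexity reduction the abstract refers to.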
Related papers
- Supervised Score-Based Modeling by Gradient Boosting [49.556736252628745]
We propose a Supervised Score-based Model (SSM), which can be viewed as a gradient boosting algorithm combined with score matching.
We provide a theoretical analysis of learning and sampling for SSM to balance inference time and prediction accuracy.
Our model outperforms existing models in both accuracy and inference time.
arXiv Detail & Related papers (2024-11-02T07:06:53Z)
- Convergence of uncertainty estimates in Ensemble and Bayesian sparse model discovery [4.446017969073817]
We show empirical success in terms of accuracy and robustness to noise with a bootstrapping-based sequentially thresholded least-squares estimator.
We show that this bootstrapping-based ensembling technique performs a provably correct variable selection procedure with an exponential convergence rate of the error rate.
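For illustration, a bootstrapping-based sequentially thresholded least-squares loop can be sketched as follows; the threshold value, refit count, and aggregation by selection frequency are assumptions made for this sketch, not the paper's exact procedure.

```python
# Illustrative sketch (not the paper's exact algorithm): bootstrap resampling
# around sequentially thresholded least squares for sparse model discovery.
import numpy as np

def stlsq(Theta, y, threshold=0.1, n_iter=10):
    """Sequentially thresholded least squares: fit, zero small terms, refit."""
    coef = np.linalg.lstsq(Theta, y, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(coef) < threshold
        coef[small] = 0.0
        if (~small).any():
            coef[~small] = np.linalg.lstsq(Theta[:, ~small], y, rcond=None)[0]
    return coef

def bootstrap_stlsq(Theta, y, n_boot=100, seed=0):
    """Ensemble the estimator over bootstrap resamples of the rows."""
    rng = np.random.default_rng(seed)
    n = Theta.shape[0]
    coefs = np.array([
        stlsq(Theta[idx], y[idx])
        for idx in (rng.integers(0, n, size=n) for _ in range(n_boot))
    ])
    inclusion = (coefs != 0).mean(axis=0)  # per-term selection frequency
    return coefs.mean(axis=0), inclusion   # averaged coefficients + stability
```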
arXiv Detail & Related papers (2023-01-30T04:07:59Z)
- Distributional Gradient Boosting Machines [77.34726150561087]
Our framework is based on XGBoost and LightGBM.
We show that our framework achieves state-of-the-art forecast accuracy.
arXiv Detail & Related papers (2022-04-02T06:32:19Z)
- Bayesian Graph Contrastive Learning [55.36652660268726]
We propose a novel perspective on graph contrastive learning methods, showing that random augmentations lead to stochastic encoders.
Our proposed method represents each node by a distribution in the latent space, in contrast to existing techniques that embed each node as a deterministic vector.
We show a considerable improvement in performance compared to existing state-of-the-art methods on several benchmark datasets.
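A minimal sketch of the distributional node representation described above, assuming a Gaussian latent family with linear mean and log-variance heads (the actual encoder architecture is not specified in this summary):

```python
# Illustrative sketch: represent each node by a Gaussian in latent space rather
# than a deterministic vector. The linear heads are an assumption of this
# sketch; the paper's encoder architecture is not given in the summary above.
import torch

class StochasticNodeEncoder(torch.nn.Module):
    def __init__(self, in_dim, latent_dim):
        super().__init__()
        self.mu = torch.nn.Linear(in_dim, latent_dim)       # per-node mean
        self.log_var = torch.nn.Linear(in_dim, latent_dim)  # per-node spread

    def forward(self, h):
        mu, log_var = self.mu(h), self.log_var(h)
        eps = torch.randn_like(mu)                # reparameterization trick
        return mu + torch.exp(0.5 * log_var) * eps, mu, log_var

h = torch.randn(100, 32)  # e.g., node features after message passing
z, mu, log_var = StochasticNodeEncoder(32, 16)(h)  # one sample per node
```

Sampling with the reparameterization trick keeps the embedding differentiable, so the distribution parameters can be trained end-to-end with a contrastive loss.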
arXiv Detail & Related papers (2021-12-15T01:45:32Z)
- A cautionary tale on fitting decision trees to data from additive models: generalization lower bounds [9.546094657606178]
We study the generalization performance of decision trees with respect to different generative regression models.
This allows us to elicit their inductive bias, that is, the assumptions the algorithms make (or do not make) to generalize to new data.
We prove a sharp squared error generalization lower bound for a large class of decision tree algorithms fitted to sparse additive models.
arXiv Detail & Related papers (2021-10-18T21:22:40Z)
- Calibration and Uncertainty Quantification of Bayesian Convolutional Neural Networks for Geophysical Applications [0.0]
To incorporate the uncertainty of predictions, such subsurface models should provide calibrated probabilities and the associated uncertainties in their predictions.
It has been shown that popular Deep Learning-based models are often miscalibrated, and due to their deterministic nature, provide no means to interpret the uncertainty of their predictions.
We compare three different approaches to obtaining probabilistic models based on convolutional neural networks in a Bayesian formalism.
arXiv Detail & Related papers (2021-05-25T17:54:23Z)
- A Generalized Stacking for Implementing Ensembles of Gradient Boosting Machines [5.482532589225552]
An approach for constructing ensembles of gradient boosting models is proposed.
It is shown that the proposed approach can be easily extended to arbitrary differentiable combination models.
arXiv Detail & Related papers (2020-10-12T21:05:45Z)
- Path Sample-Analytic Gradient Estimators for Stochastic Binary Networks [78.76880041670904]
In neural networks with binary activations and/or binary weights, training by gradient descent is complicated.
We propose a new method for this estimation problem combining sampling and analytic approximation steps.
We experimentally show higher accuracy in gradient estimation and demonstrate more stable, better-performing training of deep convolutional models.
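For context on why this estimation problem is hard, the sketch below shows the standard straight-through estimator that such methods aim to improve upon; this is the common baseline, not the paper's sample-analytic method.

```python
# Context only: the standard straight-through estimator (STE) baseline for
# gradients through binary activations; the paper proposes a different,
# sample-analytic estimator.
import torch

class BinarySTE(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)  # binary activation in {-1, 0, +1}

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        # Pass the incoming gradient straight through, clipped to |x| <= 1.
        return grad_out * (x.abs() <= 1).float()

x = torch.randn(4, requires_grad=True)
BinarySTE.apply(x).sum().backward()  # x.grad now holds the STE surrogate
```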
arXiv Detail & Related papers (2020-06-04T21:51:21Z)
- Efficient Ensemble Model Generation for Uncertainty Estimation with Bayesian Approximation in Segmentation [74.06904875527556]
We propose a generic and efficient segmentation framework to construct ensemble segmentation models.
In the proposed method, ensemble models can be efficiently generated by using a layer selection method.
We also devise a new pixel-wise uncertainty loss, which improves the predictive performance.
arXiv Detail & Related papers (2020-05-21T16:08:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.