Distributional Adaptive Soft Regression Trees
- URL: http://arxiv.org/abs/2210.10389v1
- Date: Wed, 19 Oct 2022 08:59:02 GMT
- Title: Distributional Adaptive Soft Regression Trees
- Authors: Nikolaus Umlauf and Nadja Klein
- Abstract summary: This article proposes a new type of a distributional regression tree using a multivariate soft split rule.
One great advantage of the soft split is that smooth high-dimensional functions can be estimated with only one tree.
We show by means of extensive simulation studies that the algorithm has excellent properties and outperforms various benchmark methods.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Random forests are an ensemble method relevant for many problems, such as
regression or classification. They are popular due to their good predictive
performance (compared to, e.g., decision trees) requiring only minimal tuning
of hyperparameters. They are built via aggregation of multiple regression trees
during training and are usually calculated recursively using hard splitting
rules. Recently regression forests have been incorporated into the framework of
distributional regression, a nowadays popular regression approach aiming at
estimating complete conditional distributions rather than relating the mean of
an output variable to input features only - as done classically. This article
proposes a new type of a distributional regression tree using a multivariate
soft split rule. One great advantage of the soft split is that smooth
high-dimensional functions can be estimated with only one tree while the
complexity of the function is controlled adaptive by information criteria.
Moreover, the search for the optimal split variable is obsolete. We show by
means of extensive simulation studies that the algorithm has excellent
properties and outperforms various benchmark methods, especially in the
presence of complex non-linear feature interactions. Finally, we illustrate the
usefulness of our approach with an example on probabilistic forecasts for the
Sun's activity.
Related papers
- A Unified Approach to Extract Interpretable Rules from Tree Ensembles via Integer Programming [2.1408617023874443]
Tree ensemble methods are known for their effectiveness in supervised classification and regression tasks.
Our work aims to extract an optimized list of rules from a trained tree ensemble, providing the user with a condensed, interpretable model.
arXiv Detail & Related papers (2024-06-30T22:33:47Z) - Relaxed Quantile Regression: Prediction Intervals for Asymmetric Noise [51.87307904567702]
Quantile regression is a leading approach for obtaining such intervals via the empirical estimation of quantiles in the distribution of outputs.
We propose Relaxed Quantile Regression (RQR), a direct alternative to quantile regression based interval construction that removes this arbitrary constraint.
We demonstrate that this added flexibility results in intervals with an improvement in desirable qualities.
arXiv Detail & Related papers (2024-06-05T13:36:38Z) - Inference with Mondrian Random Forests [6.97762648094816]
We give precise bias and variance characterizations, along with a Berry-Esseen-type central limit theorem, for the Mondrian random forest regression estimator.
We present valid statistical inference methods for the unknown regression function.
Efficient and implementable algorithms are devised for both batch and online learning settings.
arXiv Detail & Related papers (2023-10-15T01:41:42Z) - Structured Radial Basis Function Network: Modelling Diversity for
Multiple Hypotheses Prediction [51.82628081279621]
Multi-modal regression is important in forecasting nonstationary processes or with a complex mixture of distributions.
A Structured Radial Basis Function Network is presented as an ensemble of multiple hypotheses predictors for regression problems.
It is proved that this structured model can efficiently interpolate this tessellation and approximate the multiple hypotheses target distribution.
arXiv Detail & Related papers (2023-09-02T01:27:53Z) - Optimal Sparse Regression Trees [24.03491277969824]
This work proposes a dynamic-programming-with-bounds approach to the construction of provably-optimal sparse regression trees.
We leverage a novel lower bound based on an optimal solution to the k-Means clustering algorithm in 1-dimension over the set of labels.
arXiv Detail & Related papers (2022-11-28T00:43:21Z) - Sparse high-dimensional linear regression with a partitioned empirical
Bayes ECM algorithm [62.997667081978825]
We propose a computationally efficient and powerful Bayesian approach for sparse high-dimensional linear regression.
Minimal prior assumptions on the parameters are used through the use of plug-in empirical Bayes estimates.
The proposed approach is implemented in the R package probe.
arXiv Detail & Related papers (2022-09-16T19:15:50Z) - TreeFlow: Going beyond Tree-based Gaussian Probabilistic Regression [0.0]
We introduce TreeFlow, the tree-based approach that combines the benefits of using tree ensembles with the capabilities of modeling flexible probability distributions.
We evaluate the proposed method on challenging regression benchmarks with varying volume, feature characteristics, and target dimensionality.
arXiv Detail & Related papers (2022-06-08T20:06:23Z) - Machine Learning for Multi-Output Regression: When should a holistic
multivariate approach be preferred over separate univariate ones? [62.997667081978825]
Tree-based ensembles such as the Random Forest are modern classics among statistical learning methods.
We compare these methods in extensive simulations to help in answering the primary question when to use multivariate ensemble techniques.
arXiv Detail & Related papers (2022-01-14T08:44:25Z) - Flexible Model Aggregation for Quantile Regression [92.63075261170302]
Quantile regression is a fundamental problem in statistical learning motivated by a need to quantify uncertainty in predictions.
We investigate methods for aggregating any number of conditional quantile models.
All of the models we consider in this paper can be fit using modern deep learning toolkits.
arXiv Detail & Related papers (2021-02-26T23:21:16Z) - Multivariate Boosted Trees and Applications to Forecasting and Control [0.0]
Gradient boosted trees are non-parametric regressors that exploit sequential model fitting and gradient descent to minimize a specific loss function.
In this paper, we present a computationally efficient algorithm for fitting multivariate boosted trees.
arXiv Detail & Related papers (2020-03-08T19:26:59Z) - Adaptive Correlated Monte Carlo for Contextual Categorical Sequence
Generation [77.7420231319632]
We adapt contextual generation of categorical sequences to a policy gradient estimator, which evaluates a set of correlated Monte Carlo (MC) rollouts for variance control.
We also demonstrate the use of correlated MC rollouts for binary-tree softmax models, which reduce the high generation cost in large vocabulary scenarios.
arXiv Detail & Related papers (2019-12-31T03:01:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.