Machine Learning for Multi-Output Regression: When should a holistic
multivariate approach be preferred over separate univariate ones?
- URL: http://arxiv.org/abs/2201.05340v1
- Date: Fri, 14 Jan 2022 08:44:25 GMT
- Title: Machine Learning for Multi-Output Regression: When should a holistic
multivariate approach be preferred over separate univariate ones?
- Authors: Lena Schmid, Alexander Gerharz, Andreas Groll and Markus Pauly
- Abstract summary: Tree-based ensembles such as the Random Forest are modern classics among statistical learning methods.
We compare these methods in extensive simulations to help answer the primary question of when to use multivariate ensemble techniques.
- Score: 62.997667081978825
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Tree-based ensembles such as the Random Forest are modern classics among
statistical learning methods. In particular, they are used for predicting
univariate responses. In the case of multiple outputs, the question arises whether
we should fit separate univariate models or directly follow a multivariate approach.
For the latter, several possibilities exist that are based, e.g., on modified
splitting or stopping rules for multi-output regression. In this work we
compare these methods in extensive simulations to help answer the primary
question of when to use multivariate ensemble techniques.
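The univariate-versus-multivariate choice described in the abstract can be illustrated with a small sketch (our own example, not code from the paper; the data and settings are invented). scikit-learn's Random Forest can either be fit once per output or once on all outputs jointly, the latter using an impurity criterion averaged over outputs:

```python
# Hypothetical sketch: separately fitted univariate Random Forests vs. one
# multivariate (multi-output) forest, via scikit-learn's built-in support.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Simulated data with 3 outputs (invented for illustration)
X, Y = make_regression(n_samples=500, n_features=10, n_targets=3,
                       noise=1.0, random_state=0)
X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, random_state=0)

# Univariate approach: one forest per output, predictions stacked column-wise
uni_preds = np.column_stack([
    RandomForestRegressor(n_estimators=50, random_state=0)
    .fit(X_tr, Y_tr[:, j]).predict(X_te)
    for j in range(Y_tr.shape[1])
])

# Multivariate approach: a single forest fit on all outputs at once
multi = RandomForestRegressor(n_estimators=50, random_state=0).fit(X_tr, Y_tr)
multi_preds = multi.predict(X_te)

print("univariate MSE:  ", mean_squared_error(Y_te, uni_preds))
print("multivariate MSE:", mean_squared_error(Y_te, multi_preds))
```

On data with strongly correlated outputs, the joint fit can exploit shared structure; with unrelated outputs the separate fits often suffice, which is exactly the trade-off the paper's simulations probe.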
Related papers
- Structured Estimation of Heterogeneous Time Series [5.102931012520635]
How best to model structurally heterogeneous processes is a foundational question in the social, health and behavioral sciences.
Recently, Fisher et al. (2022) introduced the multi-VAR approach for simultaneously estimating multiple-subject multivariate time series.
This approach differs from many popular modeling approaches for multiple-subject time series in that qualitative and quantitative differences in a large number of individual dynamics are well-accommodated.
We extend the multi-VAR framework to include new adaptive weighting schemes that greatly improve estimation performance.
arXiv Detail & Related papers (2023-11-15T02:39:13Z) - Distributional Adaptive Soft Regression Trees [0.0]
This article proposes a new type of distributional regression tree using a multivariate soft split rule.
One great advantage of the soft split is that smooth high-dimensional functions can be estimated with only one tree.
We show by means of extensive simulation studies that the algorithm has excellent properties and outperforms various benchmark methods.
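The soft-split idea behind that smoothness can be sketched in a few lines (a toy illustration under our own assumptions, not the paper's algorithm): a sigmoid gate blends the two child predictions, so the fitted function varies smoothly instead of jumping at the threshold.

```python
# Toy sketch of a soft split: instead of a hard indicator x > threshold,
# a sigmoid gate smoothly interpolates between the two child values.
import numpy as np

def soft_split(x, threshold=0.0, temp=0.5, left=-1.0, right=1.0):
    # gate -> 0 far left of the threshold, -> 1 far right of it
    gate = 1.0 / (1.0 + np.exp(-(x - threshold) / temp))
    return (1 - gate) * left + gate * right

x = np.linspace(-3, 3, 7)
print(soft_split(x))  # smooth transition from -1 toward 1, no hard step
```

A hard split would output exactly -1 or 1; the soft version transitions continuously, which is why a single soft-split tree can represent smooth functions.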
arXiv Detail & Related papers (2022-10-19T08:59:02Z) - Generative machine learning methods for multivariate ensemble
post-processing [2.266704492832475]
We present a novel class of nonparametric data-driven distributional regression models based on generative machine learning.
In two case studies, our generative model shows significant improvements over state-of-the-art methods.
arXiv Detail & Related papers (2022-09-26T09:02:30Z) - An Application of a Multivariate Estimation of Distribution Algorithm to
Cancer Chemotherapy [59.40521061783166]
Chemotherapy treatment for cancer is a complex optimisation problem with a large number of interacting variables and constraints.
One might expect that the more sophisticated algorithm would yield better performance on a complex problem like this, but this is not what we observe.
We hypothesise that this is caused by the more sophisticated algorithm being impeded by the large number of interactions in the problem.
arXiv Detail & Related papers (2022-05-17T15:28:46Z) - Mixtures of Gaussian Processes for regression under multiple prior
distributions [0.0]
We extend the idea of mixture models for Gaussian Process regression in order to work with multiple prior beliefs at once.
We also consider using our approach to account for prior misspecification in functional regression problems.
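One simple way to combine multiple Gaussian Process priors (our own hedged sketch; the kernels and the likelihood-based weighting are illustrative assumptions, not the paper's method) is to fit one GP per prior kernel and weight their predictions by each model's log marginal likelihood:

```python
# Hedged sketch: two GPs with different prior kernels, combined by
# softmax weights over their log marginal likelihoods.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, Matern

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, (40, 1))
y = np.sin(X).ravel() + rng.normal(0, 0.1, 40)

# Two prior beliefs, encoded as two different covariance kernels
models = [
    GaussianProcessRegressor(kernel=RBF(), alpha=0.01).fit(X, y),
    GaussianProcessRegressor(kernel=Matern(nu=1.5), alpha=0.01).fit(X, y),
]

# Softmax weights from the fitted log marginal likelihoods
lml = np.array([m.log_marginal_likelihood_value_ for m in models])
w = np.exp(lml - lml.max())
w /= w.sum()

X_new = np.linspace(-3, 3, 20).reshape(-1, 1)
pred = sum(wi * m.predict(X_new) for wi, m in zip(w, models))
```

If one prior is badly misspecified, its marginal likelihood shrinks its weight, which is the intuition behind robustness to prior misspecification.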
arXiv Detail & Related papers (2021-04-19T10:19:14Z) - Flexible Model Aggregation for Quantile Regression [92.63075261170302]
Quantile regression is a fundamental problem in statistical learning motivated by a need to quantify uncertainty in predictions.
We investigate methods for aggregating any number of conditional quantile models.
All of the models we consider in this paper can be fit using modern deep learning toolkits.
arXiv Detail & Related papers (2021-02-26T23:21:16Z) - Conditional Generative Modeling via Learning the Latent Space [54.620761775441046]
We propose a novel framework for conditional generation in multimodal spaces.
It uses latent variables to model generalizable learning patterns.
At inference, the latent variables are optimized to find optimal solutions corresponding to multiple output modes.
arXiv Detail & Related papers (2020-10-07T03:11:34Z) - Multivariable time series classification through an interpretable
representation [0.0]
We propose a time series classification method that considers an alternative representation of time series through a set of descriptive features.
We have applied traditional classification algorithms obtaining interpretable and competitive results.
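A toy version of this feature-based pipeline (our own illustration; the three features and the synthetic data are assumptions, not the paper's descriptive set) could be:

```python
# Hedged sketch: represent each series by a few interpretable descriptive
# features (mean, std, linear-trend slope), then fit a standard classifier.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n, length = 100, 50
t = np.arange(length)

# Two synthetic classes: flat-noisy series vs. upward-trending series
flat = rng.normal(0, 1, (n, length))
trend = 0.1 * t + rng.normal(0, 1, (n, length))
series = np.vstack([flat, trend])
labels = np.array([0] * n + [1] * n)

def describe(s):
    # Interpretable features: overall level, variability, linear-trend slope
    slope = np.polyfit(t, s, 1)[0]
    return [s.mean(), s.std(), slope]

features = np.array([describe(s) for s in series])
clf = LogisticRegression().fit(features, labels)
print("training accuracy:", clf.score(features, labels))
```

Because each feature has a direct meaning, the fitted coefficients are readable (e.g. a large positive weight on the slope feature), which is the interpretability the paper emphasizes.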
arXiv Detail & Related papers (2020-09-08T09:44:03Z) - Ambiguity in Sequential Data: Predicting Uncertain Futures with
Recurrent Models [110.82452096672182]
We propose an extension of the Multiple Hypothesis Prediction (MHP) model to handle ambiguous predictions with sequential data.
We also introduce a novel metric for ambiguous problems, which is better suited to account for uncertainties.
arXiv Detail & Related papers (2020-03-10T09:15:42Z) - Decision-Making with Auto-Encoding Variational Bayes [71.44735417472043]
We show that a posterior approximation distinct from the variational distribution should be used for making decisions.
Motivated by these theoretical results, we propose learning several approximate proposals for the best model.
In addition to toy examples, we present a full-fledged case study of single-cell RNA sequencing.
arXiv Detail & Related papers (2020-02-17T19:23:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.