Related papers: Distributional Random Forests: Heterogeneity Adjustment and Multivariate Distributional Regression

URL: http://arxiv.org/abs/2005.14458v3
Date: Wed, 12 Oct 2022 08:34:46 GMT
Title: Distributional Random Forests: Heterogeneity Adjustment and Multivariate Distributional Regression
Authors: Domagoj Ćevid, Loris Michel, Jeffrey Näf, Nicolai Meinshausen, Peter Bühlmann
Abstract summary: We propose a novel forest construction for multivariate responses based on their joint conditional distribution. The code is available as Python and R packages drf.
Score: 0.8574682463936005
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Random Forest (Breiman, 2001) is a successful and widely used regression and classification algorithm. Part of its appeal and reason for its versatility is its (implicit) construction of a kernel-type weighting function on training data, which can also be used for targets other than the original mean estimation. We propose a novel forest construction for multivariate responses based on their joint conditional distribution, independent of the estimation target and the data model. It uses a new splitting criterion based on the MMD distributional metric, which is suitable for detecting heterogeneity in multivariate distributions. The induced weights define an estimate of the full conditional distribution, which in turn can be used for arbitrary and potentially complicated targets of interest. The method is very versatile and convenient to use, as we illustrate on a wide range of examples. The code is available as Python and R packages drf.
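Illustration: the two ideas in the abstract, an MMD-based measure of distributional difference and weights that define a conditional distribution estimate, can be sketched in a few lines of NumPy. This is a minimal sketch under stated assumptions, not the authors' implementation and not the API of the drf packages: the function names (gaussian_kernel, mmd2_biased, weighted_quantile), the Gaussian kernel, and the bandwidth parameter are all hypothetical choices made for illustration, and the paper's actual splitting criterion may be computed differently.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian kernel matrix k(a_i, b_j) for row-wise samples a and b."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point cancellation.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def mmd2_biased(left, right, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two samples.

    A large value suggests that the responses in the two candidate child
    nodes follow different multivariate distributions.
    """
    k_ll = gaussian_kernel(left, left, bandwidth).mean()
    k_rr = gaussian_kernel(right, right, bandwidth).mean()
    k_lr = gaussian_kernel(left, right, bandwidth).mean()
    return k_ll + k_rr - 2.0 * k_lr


def weighted_quantile(values, weights, q):
    """Quantile of a 1-D sample under nonnegative weights that sum to one.

    The weights stand in for forest-induced weights on the training points
    (an assumption here; they are supplied by the caller).
    """
    order = np.argsort(values)
    cumulative = np.cumsum(weights[order])
    idx = np.searchsorted(cumulative, q, side="left")
    return values[order][min(idx, len(values) - 1)]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    same = mmd2_biased(rng.normal(size=(200, 2)), rng.normal(size=(200, 2)))
    shifted = mmd2_biased(rng.normal(size=(200, 2)), rng.normal(size=(200, 2)) + 3.0)
    print("MMD^2, same distribution:", same)
    print("MMD^2, shifted distribution:", shifted)
```

In a forest built along these lines, a candidate split would be scored by how different the response samples in the two child nodes are, and the resulting per-observation weights could be passed to estimators such as weighted_quantile to target quantities other than the conditional mean.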

Related papers

Semiparametric conformal prediction [79.6147286161434]
Risk-sensitive applications require well-calibrated prediction sets over multiple, potentially correlated target variables. We treat the scores as random vectors and aim to construct the prediction set accounting for their joint correlation structure. We report desired coverage and competitive efficiency on a range of real-world regression problems.
arXiv Detail & Related papers (2024-11-04T14:29:02Z)
Generative Assignment Flows for Representing and Learning Joint Distributions of Discrete Data [2.6499018693213316]
We introduce a novel generative model for the representation of joint probability distributions of discrete random variables. The approach uses measure transport by randomized assignment flows on the statistical submanifold of factorizing distributions.
arXiv Detail & Related papers (2024-06-06T21:58:33Z)
Relaxed Quantile Regression: Prediction Intervals for Asymmetric Noise [51.87307904567702]
Quantile regression is a leading approach for obtaining prediction intervals via the empirical estimation of quantiles in the distribution of outputs. We propose Relaxed Quantile Regression (RQR), a direct alternative to quantile regression based interval construction that removes this arbitrary constraint. We demonstrate that this added flexibility results in intervals with an improvement in desirable qualities.
arXiv Detail & Related papers (2024-06-05T13:36:38Z)
Inference with Mondrian Random Forests [6.97762648094816]
We give precise bias and variance characterizations, along with a Berry-Esseen-type central limit theorem, for the Mondrian random forest regression estimator. We present valid statistical inference methods for the unknown regression function. Efficient and implementable algorithms are devised for both batch and online learning settings.
arXiv Detail & Related papers (2023-10-15T01:41:42Z)
Variational autoencoder with weighted samples for high-dimensional non-parametric adaptive importance sampling [0.0]
We extend the existing framework to the case of weighted samples by introducing a new objective function. In order to add flexibility to the model and to be able to learn multimodal distributions, we consider a learnable prior distribution. We exploit the proposed procedure in existing adaptive importance sampling algorithms to draw points from a target distribution and to estimate a rare event probability in high dimension.
arXiv Detail & Related papers (2023-10-13T15:40:55Z)
Wrapped Distributions on homogeneous Riemannian manifolds [58.720142291102135]
Control over distributions' properties, such as parameters, symmetry and modality, yields a family of flexible distributions. We empirically validate our approach by utilizing our proposed distributions within a variational autoencoder and a latent space network model.
arXiv Detail & Related papers (2022-04-20T21:25:21Z)
Machine Learning for Multi-Output Regression: When should a holistic multivariate approach be preferred over separate univariate ones? [62.997667081978825]
Tree-based ensembles such as the Random Forest are modern classics among statistical learning methods. We compare these methods in extensive simulations to help answer the primary question of when to use multivariate ensemble techniques.
arXiv Detail & Related papers (2022-01-14T08:44:25Z)
Multivariate Probabilistic Regression with Natural Gradient Boosting [63.58097881421937]
We propose a Natural Gradient Boosting (NGBoost) approach based on nonparametrically modeling the conditional parameters of the multivariate predictive distribution. Our method is robust, works out-of-the-box without extensive tuning, is modular with respect to the assumed target distribution, and performs competitively in comparison to existing approaches.
arXiv Detail & Related papers (2021-06-07T17:44:49Z)
Probabilistic Kolmogorov-Arnold Network [1.4732811715354455]
The present paper proposes a method for estimating probability distributions of the outputs in the case of aleatoric uncertainty. The suggested approach covers input-dependent probability distributions of the outputs, as well as the variation of the distribution type with the inputs. Although the method is applicable to any regression model, the present paper combines it with KANs, since the specific structure of KANs leads to computationally efficient model construction.
arXiv Detail & Related papers (2021-04-04T23:49:15Z)
Nonlinear Distribution Regression for Remote Sensing Applications [6.664736150040092]
In many remote sensing applications one wants to estimate variables or parameters of interest from observations. Standard algorithms such as neural networks, random forests or Gaussian processes are readily available to relate the two. This paper introduces a nonlinear (kernel-based) method for distribution regression that solves the previous problems without making any assumption on the statistics of the grouped data.
arXiv Detail & Related papers (2020-12-07T22:04:43Z)
An Embedded Model Estimator for Non-Stationary Random Functions using Multiple Secondary Variables [0.0]
This paper introduces the method and shows that it has consistency results that are similar in nature to those applying to geostatistical modelling and to Quantile Random Forests. The algorithm works by estimating a conditional distribution for the target variable at each target location.
arXiv Detail & Related papers (2020-11-09T00:14:24Z)
Distributional Reinforcement Learning via Moment Matching [54.16108052278444]
We formulate a method that learns a finite set of statistics from each return distribution via neural networks. Our method can be interpreted as implicitly matching all orders of moments between a return distribution and its Bellman target. Experiments on the suite of Atari games show that our method outperforms the standard distributional RL baselines.
arXiv Detail & Related papers (2020-07-24T05:18:17Z)
Decision-Making with Auto-Encoding Variational Bayes [71.44735417472043]
We show that a posterior approximation distinct from the variational distribution should be used for making decisions. Motivated by these theoretical results, we propose learning several approximate proposals for the best model. In addition to toy examples, we present a full-fledged case study of single-cell RNA sequencing.
arXiv Detail & Related papers (2020-02-17T19:23:36Z)

This list is automatically generated from the titles and abstracts of the papers on this site.