Medoid splits for efficient random forests in metric spaces
- URL: http://arxiv.org/abs/2306.17031v1
- Date: Thu, 29 Jun 2023 15:32:11 GMT
- Title: Medoid splits for efficient random forests in metric spaces
- Authors: Matthieu Bulté and Helle Sørensen
- Abstract summary: This paper revisits an adaptation of the random forest for Fréchet regression, addressing the challenge of regression in metric spaces.
We introduce a new splitting rule that circumvents the computationally expensive operation of Fréchet means by substituting with a medoid-based approach.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper revisits an adaptation of the random forest algorithm for
Fréchet regression, addressing the challenge of regression in the context of
random objects in metric spaces. Recognizing the limitations of previous
approaches, we introduce a new splitting rule that circumvents the
computationally expensive operation of Fréchet means by substituting with a
medoid-based approach. We validate this approach by demonstrating its
asymptotic equivalence to Fréchet mean-based procedures and establish the
consistency of the associated regression estimator. The paper provides a sound
theoretical framework and a more efficient computational approach to Fréchet
regression, broadening its application to non-standard data types and complex
use cases.
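The idea of a medoid-based split can be illustrated with a minimal sketch. The abstract does not spell out the paper's exact criterion, so everything here is an assumption made for illustration: the function names (`medoid`, `medoid_impurity`, `best_medoid_split`), the use of squared distances as the cost, and the restriction to a single scalar predictor. The point it shows is that a medoid is chosen among the observed responses, so only pairwise distances are needed and no Fréchet mean has to be computed by optimisation over the metric space.

```python
from typing import Callable, List, Optional, Sequence, Tuple, TypeVar

T = TypeVar("T")
Metric = Callable[[T, T], float]


def medoid(points: Sequence[T], dist: Metric) -> T:
    """Return the observed point with the smallest sum of squared distances
    to all points in the sample (ties go to the first such point)."""
    return min(points, key=lambda c: sum(dist(c, p) ** 2 for p in points))


def medoid_impurity(points: Sequence[T], dist: Metric) -> float:
    """Sum of squared distances from the medoid to every point (assumed cost)."""
    m = medoid(points, dist)
    return sum(dist(m, p) ** 2 for p in points)


def best_medoid_split(
    x: Sequence[float], y: Sequence[T], dist: Metric
) -> Tuple[Optional[float], float]:
    """Scan thresholds on a scalar predictor x and return (threshold, cost),
    where cost is the summed medoid impurity of the two child nodes."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    best_threshold: Optional[float] = None
    best_cost = float("inf")
    for k in range(1, len(order)):
        lo, hi = order[k - 1], order[k]
        if x[lo] == x[hi]:
            continue  # no threshold separates identical predictor values
        left = [y[i] for i in order[:k]]
        right = [y[i] for i in order[k:]]
        cost = medoid_impurity(left, dist) + medoid_impurity(right, dist)
        if cost < best_cost:
            best_threshold, best_cost = (x[lo] + x[hi]) / 2, cost
    return best_threshold, best_cost


if __name__ == "__main__":
    # Toy data: real-valued responses with the absolute difference as metric.
    xs = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
    ys = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
    print(best_medoid_split(xs, ys, lambda a, b: abs(a - b)))
```

Because `dist` is an arbitrary callable, the same sketch applies to any response type for which a metric is available. This naive version recomputes distances at every candidate threshold; a practical implementation would likely cache the pairwise distance matrix once and reuse it.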
Related papers
- Progression: an extrapolation principle for regression [0.0]
We propose a novel statistical extrapolation principle.
It assumes a simple relationship between predictors and the response at the boundary of the training predictor samples.
Our semi-parametric method, progression, leverages this extrapolation principle and offers guarantees on the approximation error beyond the training data range.
arXiv Detail & Related papers (2024-10-30T17:29:51Z)
- Relaxed Quantile Regression: Prediction Intervals for Asymmetric Noise [51.87307904567702]
Quantile regression is a leading approach for obtaining prediction intervals via the empirical estimation of quantiles in the distribution of outputs.
We propose Relaxed Quantile Regression (RQR), a direct alternative to quantile regression based interval construction that removes this arbitrary constraint.
We demonstrate that this added flexibility results in intervals with an improvement in desirable qualities.
arXiv Detail & Related papers (2024-06-05T13:36:38Z)
- Distributed High-Dimensional Quantile Regression: Estimation Efficiency and Support Recovery [0.0]
We focus on distributed estimation and support recovery for high-dimensional linear quantile regression.
We transform the original quantile regression into a least-squares optimization problem.
An efficient algorithm is developed, which enjoys high computation and communication efficiency.
arXiv Detail & Related papers (2024-05-13T08:32:22Z)
- Amortizing intractable inference in large language models [56.92471123778389]
We use amortized Bayesian inference to sample from intractable posterior distributions.
We empirically demonstrate that this distribution-matching paradigm of LLM fine-tuning can serve as an effective alternative to maximum-likelihood training.
As an important application, we interpret chain-of-thought reasoning as a latent variable modeling problem.
arXiv Detail & Related papers (2023-10-06T16:36:08Z)
- Refining Amortized Posterior Approximations using Gradient-Based Summary Statistics [0.9176056742068814]
We present an iterative framework to improve the amortized approximations of posterior distributions in the context of inverse problems.
We validate our method in a controlled setting by applying it to a stylized problem, and observe improved posterior approximations with each iteration.
arXiv Detail & Related papers (2023-05-15T15:47:19Z)
- Risk Consistent Multi-Class Learning from Label Proportions [64.0125322353281]
This study addresses a multiclass learning from label proportions (MCLLP) setting in which training instances are provided in bags.
Most existing MCLLP methods impose bag-wise constraints on the prediction of instances or assign them pseudo-labels.
A risk-consistent method is proposed for instance classification using the empirical risk minimization framework.
arXiv Detail & Related papers (2022-03-24T03:49:04Z)
- Random Forest Weighted Local Fréchet Regression with Random Objects [18.128663071848923]
We propose a novel random forest weighted local Fréchet regression paradigm.
Our first method uses these weights as a local average to solve for the conditional Fréchet mean.
The second method performs local linear Fréchet regression. Both significantly improve on existing Fréchet regression methods.
arXiv Detail & Related papers (2022-02-10T09:10:59Z)
- Self-Certifying Classification by Linearized Deep Assignment [65.0100925582087]
We propose a novel class of deep predictors for classifying metric data on graphs within the PAC-Bayes risk certification paradigm.
Building on the recent PAC-Bayes literature and data-dependent priors, this approach enables learning posterior distributions on the hypothesis space.
arXiv Detail & Related papers (2022-01-26T19:59:14Z)
- Optimal variance-reduced stochastic approximation in Banach spaces [114.8734960258221]
We study the problem of estimating the fixed point of a contractive operator defined on a separable Banach space.
We establish non-asymptotic bounds for both the operator defect and the estimation error.
arXiv Detail & Related papers (2022-01-21T02:46:57Z)
- Communication-Efficient Distributed Quantile Regression with Optimal Statistical Guarantees [2.064612766965483]
We address the problem of how to achieve optimal inference in distributed quantile regression without stringent scaling conditions.
The difficulties are resolved through a double-smoothing approach that is applied to the local (at each data source) and global objective functions.
Despite the reliance on a delicate combination of local and global smoothing parameters, the quantile regression model is fully parametric.
arXiv Detail & Related papers (2021-10-25T17:09:59Z)
- Slice Sampling for General Completely Random Measures [74.24975039689893]
We present a novel Markov chain Monte Carlo algorithm for posterior inference that adaptively sets the truncation level using auxiliary slice variables.
The efficacy of the proposed algorithm is evaluated on several popular nonparametric models.
arXiv Detail & Related papers (2020-06-24T17:53:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.