Related papers: On the Effect of Regularization on Nonparametric Mean-Variance Regression

URL: http://arxiv.org/abs/2511.22004v1
Date: Thu, 27 Nov 2025 01:09:28 GMT
Title: On the Effect of Regularization on Nonparametric Mean-Variance Regression
Authors: Eliot Wong-Toi, Alex Boyd, Vincent Fortuin, Stephan Mandt
Abstract summary: We develop a statistical field theory framework, which captures the observed phase transition in alignment with experimental results. Experiments on UCI datasets and the large-scale ClimSim dataset demonstrate robust calibration performance, effectively quantifying predictive uncertainty.
Score: 22.758981850171548
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: Uncertainty quantification is vital for decision-making and risk assessment in machine learning. Mean-variance regression models, which predict both a mean and residual noise for each data point, provide a simple approach to uncertainty quantification. However, overparameterized mean-variance models struggle with signal-to-noise ambiguity, deciding whether prediction targets should be attributed to signal (mean) or noise (variance). At one extreme, models fit all training targets perfectly with zero residual noise, while at the other, they provide constant, uninformative predictions and explain the targets as noise. We observe a sharp phase transition between these extremes, driven by model regularization. Empirical studies with varying regularization levels illustrate this transition, revealing substantial variability across repeated runs. To explain this behavior, we develop a statistical field theory framework, which captures the observed phase transition in alignment with experimental results. This analysis reduces the regularization hyperparameter search space from two dimensions to one, significantly lowering computational costs. Experiments on UCI datasets and the large-scale ClimSim dataset demonstrate robust calibration performance, effectively quantifying predictive uncertainty.
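The two extremes described in the abstract can be illustrated with a minimal sketch of a Gaussian mean-variance objective. The abstract does not state the exact form of the regularizer, so the L2 penalties, the names `alpha` and `beta` for the two regularization strengths, and the toy targets below are assumptions made for illustration only, not the authors' implementation.

```python
import math


def gaussian_nll(targets, means, log_vars):
    """Average Gaussian negative log-likelihood with a per-point mean and log-variance."""
    total = 0.0
    for y, mu, s in zip(targets, means, log_vars):
        total += 0.5 * (math.log(2.0 * math.pi) + s + (y - mu) ** 2 / math.exp(s))
    return total / len(targets)


def penalized_objective(targets, means, log_vars, mean_weights, var_weights, alpha, beta):
    """NLL plus separate L2 penalties on the mean and variance parameters.

    alpha and beta are hypothetical names for the two regularization strengths.
    """
    penalty = alpha * sum(w * w for w in mean_weights) + beta * sum(w * w for w in var_weights)
    return gaussian_nll(targets, means, log_vars) + penalty


# Toy targets, chosen arbitrarily for illustration.
targets = [0.3, -1.2, 0.8, 1.5, -0.4]
n = len(targets)

# Extreme 1: the mean interpolates every target and the predicted variance shrinks.
interp_nll = gaussian_nll(targets, targets, [-10.0] * n)

# Extreme 2: a constant mean, with the targets explained entirely as noise.
avg = sum(targets) / n
var = sum((y - avg) ** 2 for y in targets) / n
const_nll = gaussian_nll(targets, [avg] * n, [math.log(var)] * n)

print(interp_nll, const_nll)
```

Without a penalty, the interpolating solution attains a much lower training loss than the constant one, and the loss keeps falling as the predicted variance shrinks. In this sketch, the penalty terms are what can tip the balance toward the constant solution.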

Related papers

Practical Deep Heteroskedastic Regression [15.023152666894049]
In heteroskedastic regression, where the uncertainty of the target depends on the input, a common approach is to train a neural network that parameterizes the mean and the variance of the predictive distribution. We propose a simple and efficient procedure that addresses these challenges jointly by post-hoc fitting a variance model across the intermediate layers of a pretrained network on a hold-out dataset. We demonstrate that our method achieves on-par or state-of-the-art uncertainty quantification on several molecular graph datasets, without compromising mean prediction accuracy and while remaining cheap to use at prediction time.
arXiv Detail & Related papers (2026-03-02T11:19:32Z)
Effective Causal Discovery under Identifiable Heteroscedastic Noise Model [45.98718860540588]
Causal DAG learning has recently achieved promising performance in terms of both accuracy and efficiency. We propose a novel formulation for DAG learning that accounts for the variation in noise variance across variables and observations. We then propose an effective two-phase iterative DAG learning algorithm to address the increasing optimization difficulties.
arXiv Detail & Related papers (2023-12-20T08:51:58Z)
Selective Nonparametric Regression via Testing [54.20569354303575]
We develop an abstention procedure by testing a hypothesis on the value of the conditional variance at a given point. Unlike existing methods, the proposed one accounts not only for the value of the variance itself but also for the uncertainty of the corresponding variance predictor.
arXiv Detail & Related papers (2023-09-28T13:04:11Z)
Quantifying predictive uncertainty of aphasia severity in stroke patients with sparse heteroscedastic Bayesian high-dimensional regression [47.1405366895538]
Sparse linear regression methods for high-dimensional data commonly assume that residuals have constant variance, which can be violated in practice. This paper proposes estimating high-dimensional heteroscedastic linear regression models using a heteroscedastic partitioned empirical Bayes Expectation Conditional Maximization algorithm.
arXiv Detail & Related papers (2023-09-15T22:06:29Z)
Understanding Pathologies of Deep Heteroskedastic Regression [25.509884677111344]
Heteroskedastic models predict both mean and residual noise for each data point. At one extreme, these models fit all training data perfectly, eliminating residual noise entirely. At the other, they overfit the residual noise while predicting a constant, uninformative mean. We observe a lack of middle ground, suggesting a phase transition dependent on model regularization strength.
arXiv Detail & Related papers (2023-06-29T06:31:27Z)
Variational Imbalanced Regression: Fair Uncertainty Quantification via Probabilistic Smoothing [11.291393872745951]
Existing regression models tend to fall short in both accuracy and uncertainty estimation when the label distribution is imbalanced. We propose a probabilistic deep learning model, dubbed variational imbalanced regression (VIR). VIR not only performs well in imbalanced regression but also naturally produces reasonable uncertainty estimation as a byproduct.
arXiv Detail & Related papers (2023-06-11T06:27:06Z)
Anomaly Detection with Variance Stabilized Density Estimation [49.46356430493534]
We present a variance-stabilized density estimation problem for maximizing the likelihood of the observed samples. To obtain a reliable anomaly detector, we introduce a spectral ensemble of autoregressive models for learning the variance-stabilized distribution. We have conducted an extensive benchmark with 52 datasets, demonstrating that our method leads to state-of-the-art results.
arXiv Detail & Related papers (2023-06-01T11:52:58Z)
Heavy-tailed Streaming Statistical Estimation [58.70341336199497]
We consider the task of heavy-tailed statistical estimation given streaming $p$-dimensional samples. We design a clipped gradient descent algorithm and provide an improved analysis under a more nuanced condition on the noise of gradients.
arXiv Detail & Related papers (2021-08-25T21:30:27Z)
Increasing the efficiency of randomized trial estimates via linear adjustment for a prognostic score [59.75318183140857]
Estimating causal effects from randomized experiments is central to clinical research. Most methods for historical borrowing achieve reductions in variance by sacrificing strict type-I error rate control.
arXiv Detail & Related papers (2020-12-17T21:10:10Z)
Multiplicative noise and heavy tails in stochastic optimization [62.993432503309485]
Empirical optimization is central to modern machine learning, but its role in that success is still unclear. We show that heavy tails commonly arise in the parameters of discrete multiplicative noise due to variance. A detailed analysis is conducted in which we describe key factors, including step size and data, all of which exhibit similar results on state-of-the-art neural network models.
arXiv Detail & Related papers (2020-06-11T09:58:01Z)
Balance-Subsampled Stable Prediction [55.13512328954456]
We propose a novel balance-subsampled stable prediction (BSSP) algorithm based on the theory of fractional factorial design. A design-theoretic analysis shows that the proposed method can reduce the confounding effects among predictors induced by the distribution shift. Numerical experiments on both synthetic and real-world data sets demonstrate that our BSSP algorithm significantly outperforms the baseline methods for stable prediction across unknown test data.
arXiv Detail & Related papers (2020-06-08T07:01:38Z)
Stable Prediction with Model Misspecification and Agnostic Distribution Shift [41.26323389341987]
In machine learning algorithms, two main assumptions are required to guarantee performance. One is that the test data are drawn from the same distribution as the training data, and the other is that the model is correctly specified. Under model misspecification, distribution shift between training and test data leads to inaccurate parameter estimation and unstable prediction across unknown test data. We propose a novel Decorrelated Weighting Regression (DWR) algorithm which jointly optimizes a variable decorrelation regularizer and a weighted regression model.
arXiv Detail & Related papers (2020-01-31T08:56:35Z)
Maximum likelihood estimation and uncertainty quantification for Gaussian process approximation of deterministic functions [10.319367855067476]
This article provides one of the first theoretical analyses in the context of Gaussian process regression with a noiseless dataset. We show that the maximum likelihood estimation of the scale parameter alone provides significant adaptation against misspecification of the Gaussian process model.
arXiv Detail & Related papers (2020-01-29T17:20:21Z)

This list is automatically generated from the titles and abstracts of the papers on this site.