Pointwise confidence estimation in the non-linear $\ell^2$-regularized least squares
- URL: http://arxiv.org/abs/2506.07088v2
- Date: Tue, 10 Jun 2025 18:59:00 GMT
- Title: Pointwise confidence estimation in the non-linear $\ell^2$-regularized least squares
- Authors: Ilja Kuzborskij, Yasin Abbasi Yadkori
- Abstract summary: We consider a high-probability non-asymptotic confidence estimation in the $\ell^2$-regularized non-linear least-squares setting with fixed design. We show a pointwise confidence bound, meaning that it holds for the prediction on any given fixed test input $x$.
- Score: 12.352761060862072
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We consider high-probability non-asymptotic confidence estimation in the $\ell^2$-regularized non-linear least-squares setting with fixed design. In particular, we study confidence estimation for local minimizers of the regularized training loss. We show a pointwise confidence bound, meaning that it holds for the prediction on any given fixed test input $x$. Importantly, the proposed confidence bound scales with the similarity of the test input to the training data in the implicit feature space of the predictor (for instance, becoming very large when the test input lies far outside of the training data). This desirable feature is captured by a weighted norm involving the inverse-Hessian matrix of the objective function, which is a generalized version of its counterpart in the linear setting, $x^{\top} \text{Cov}^{-1} x$. Our generalized result can be regarded as a non-asymptotic counterpart of the classical confidence interval based on asymptotic normality of the maximum likelihood estimator (MLE). We propose an efficient method for computing the weighted norm, which only mildly exceeds the cost of a gradient computation of the loss function. Finally, we complement our analysis with empirical evidence showing that the proposed confidence bound provides a better coverage/width trade-off than confidence estimation by bootstrapping, which is a gold-standard method in many applications involving non-linear predictors such as neural networks.
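As a rough illustration of the linear-setting quantity $x^{\top} \text{Cov}^{-1} x$ mentioned in the abstract, the sketch below computes it for a toy ridge-regression design. It is not the authors' method: it assumes that $\text{Cov}$ stands for the regularized Gram matrix $X^{\top} X + \lambda I$, and it uses a generic conjugate-gradient solver that only needs matrix-vector products. All function names (`regularized_gram_matvec`, `conjugate_gradient`, `weighted_norm_sq`) and the toy data are hypothetical.

```python
def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(ai * bi for ai, bi in zip(a, b))


def regularized_gram_matvec(X, lam, v):
    """Return (X^T X + lam * I) v without forming the d x d matrix."""
    Xv = [dot(row, v) for row in X]           # X v, one entry per training row
    out = [lam * vj for vj in v]              # lam * v
    for row, s in zip(X, Xv):                 # accumulate X^T (X v)
        for j in range(len(v)):
            out[j] += row[j] * s
    return out


def conjugate_gradient(matvec, b, tol=1e-12, max_iter=100):
    """Solve A u = b for symmetric positive definite A given only v -> A v."""
    u = [0.0] * len(b)
    r = list(b)                               # residual b - A u (u starts at 0)
    p = list(r)                               # search direction
    rs = dot(r, r)
    for _ in range(max_iter):
        if rs <= tol:                         # squared residual small enough
            break
        Ap = matvec(p)
        alpha = rs / dot(p, Ap)
        u = [ui + alpha * pi for ui, pi in zip(u, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return u


def weighted_norm_sq(X, lam, x_test):
    """Compute x^T (X^T X + lam * I)^{-1} x (assumed meaning of Cov)."""
    u = conjugate_gradient(lambda v: regularized_gram_matvec(X, lam, v), x_test)
    return dot(x_test, u)


# Toy design: training inputs lie almost entirely along the first axis.
X_train = [[1.0, 0.0], [2.0, 0.0], [3.0, 0.1]]
lam = 0.1
near = weighted_norm_sq(X_train, lam, [1.0, 0.0])  # direction seen in training
far = weighted_norm_sq(X_train, lam, [0.0, 1.0])   # direction barely seen
print(near, far)
```

In this toy design the weighted norm is much larger for the test input pointing in the barely observed direction, which matches the qualitative behavior described in the abstract. Relying only on matrix-vector products is one common way to avoid forming or inverting the matrix explicitly; whether it corresponds to the paper's efficient procedure is not stated here.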
Related papers
- Approximating Full Conformal Prediction for Neural Network Regression with Gauss-Newton Influence [8.952347049759094]
We construct prediction intervals for neural network regressors post-hoc without held-out data. We train just once and locally perturb model parameters using Gauss-Newton influence.
arXiv Detail & Related papers (2025-07-27T13:34:32Z)
- Non-Asymptotic Uncertainty Quantification in High-Dimensional Learning [5.318766629972959]
Uncertainty quantification is a crucial but challenging task in many high-dimensional regression or learning problems.
We develop a new data-driven approach for UQ in regression that applies both to classical regression approaches as well as to neural networks.
arXiv Detail & Related papers (2024-07-18T16:42:10Z)
- Finite Sample Confidence Regions for Linear Regression Parameters Using Arbitrary Predictors [1.6860963320038902]
We explore a novel methodology for constructing confidence regions for parameters of linear models, using predictions from any arbitrary predictor.
The derived confidence regions can be cast as constraints within a Mixed Linear Programming framework, enabling optimisation of linear objectives.
Unlike previous methods, the confidence region can be empty, which can be used for hypothesis testing.
arXiv Detail & Related papers (2024-01-27T00:15:48Z)
- Calibrating Neural Simulation-Based Inference with Differentiable Coverage Probability [50.44439018155837]
We propose to include a calibration term directly into the training objective of the neural model.
By introducing a relaxation of the classical formulation of calibration error we enable end-to-end backpropagation.
It is directly applicable to existing computational pipelines allowing reliable black-box posterior inference.
arXiv Detail & Related papers (2023-10-20T10:20:45Z)
- Improved uncertainty quantification for neural networks with Bayesian last layer [0.0]
Uncertainty quantification is an important task in machine learning.
We present a reformulation of the log-marginal likelihood of a NN with BLL which allows for efficient training using backpropagation.
arXiv Detail & Related papers (2023-02-21T20:23:56Z)
- Confidence Sets under Generalized Self-Concordance [2.0305676256390934]
This paper revisits a fundamental problem in statistical inference from a non-asymptotic theoretical viewpoint.
We establish an exponential bound for the estimator, characterizing its behavior in a non-asymptotic fashion.
An important trace of its dependency is captured by the effective dimension.
arXiv Detail & Related papers (2022-12-31T17:45:11Z)
- Learning to Estimate Without Bias [57.82628598276623]
The Gauss-Markov theorem states that the weighted least squares estimator is a linear minimum variance unbiased estimator (MVUE) in linear models.
In this paper, we take a first step towards extending this result to non-linear settings via deep learning with bias constraints.
A second motivation for bias-constrained estimators (BCE) is in applications where multiple estimates of the same unknown are averaged for improved performance.
arXiv Detail & Related papers (2021-10-24T10:23:51Z)
- Near-optimal inference in adaptive linear regression [60.08422051718195]
Even simple methods like least squares can exhibit non-normal behavior when data is collected in an adaptive manner.
We propose a family of online debiasing estimators to correct these distributional anomalies in least squares estimation.
We demonstrate the usefulness of our theory via applications to multi-armed bandit, autoregressive time series estimation, and active learning with exploration.
arXiv Detail & Related papers (2021-07-05T21:05:11Z)
- SLOE: A Faster Method for Statistical Inference in High-Dimensional Logistic Regression [68.66245730450915]
We develop an improved method for debiasing predictions and estimating frequentist uncertainty for practical datasets.
Our main contribution is SLOE, an estimator of the signal strength with convergence guarantees that reduces the computation time of estimation and inference by orders of magnitude.
arXiv Detail & Related papers (2021-03-23T17:48:56Z)
- CoinDICE: Off-Policy Confidence Interval Estimation [107.86876722777535]
We study high-confidence behavior-agnostic off-policy evaluation in reinforcement learning.
We show in a variety of benchmarks that the confidence interval estimates are tighter and more accurate than existing methods.
arXiv Detail & Related papers (2020-10-22T12:39:11Z)
- $\gamma$-ABC: Outlier-Robust Approximate Bayesian Computation Based on a Robust Divergence Estimator [95.71091446753414]
We propose to use a nearest-neighbor-based $\gamma$-divergence estimator as a data discrepancy measure.
Our method achieves significantly higher robustness than existing discrepancy measures.
arXiv Detail & Related papers (2020-06-13T06:09:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.