Generalisation and benign over-fitting for linear regression onto random functional covariates
- URL: http://arxiv.org/abs/2508.13895v1
- Date: Tue, 19 Aug 2025 15:01:20 GMT
- Title: Generalisation and benign over-fitting for linear regression onto random functional covariates
- Authors: Andrew Jones, Nick Whiteley
- Abstract summary: We study the theoretical predictive performance of ridge and ridge-less least-squares regression.
We derive convergence rates in regimes where $p$ grows fast relative to $n$.
- Score: 4.344644788142548
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study the theoretical predictive performance of ridge and ridge-less least-squares regression when covariate vectors arise from evaluating $p$ random, mean-square continuous functions over a latent metric space at $n$ random and unobserved locations, subject to additive noise. This leads us away from the standard assumption of i.i.d. data to a setting in which the $n$ covariate vectors are exchangeable but not independent in general. Under assumptions of independence across dimensions, $4$-th order moment bounds, and other regularity conditions, we obtain probabilistic bounds on a notion of predictive excess risk adapted to our random functional covariate setting, making use of recent results of Barzilai and Shamir. We derive convergence rates in regimes where $p$ grows suitably fast relative to $n$, illustrating the interplay between ingredients of the model in determining convergence behaviour and the role of additive covariate noise in benign over-fitting.
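
A minimal simulation of this setting (a toy construction of our own, not the authors' exact model or bounds): $p$ random Fourier-type functions, which are mean-square continuous, are evaluated at $n$ latent locations with additive covariate noise, and ridge versus ridge-less (minimum-norm) least squares are compared on held-out locations.

```python
# Toy version of the random functional covariate setting (our construction):
# covariates Z_ij = f_j(x_i) + noise for p random functions f_j evaluated at
# n latent locations x_i; compare ridge and ridge-less fits on fresh locations.
import numpy as np

rng = np.random.default_rng(0)
n, p, sigma_cov = 100, 2000, 0.1

def random_functions(p, rng):
    # Each f_j(x) = a_j cos(w_j x) + b_j sin(w_j x): a random Fourier-type
    # function, mean-square continuous on the latent space [0, 1].
    w = rng.normal(scale=5.0, size=p)
    a, b = rng.normal(size=p), rng.normal(size=p)
    return lambda x: a * np.cos(np.outer(x, w)) + b * np.sin(np.outer(x, w))

f = random_functions(p, rng)
g = lambda x: np.sin(2 * np.pi * x)                 # latent regression function
x_train, x_test = rng.uniform(size=n), rng.uniform(size=n)
Z = f(x_train) + sigma_cov * rng.normal(size=(n, p))   # noisy covariates
y = g(x_train) + 0.1 * rng.normal(size=n)
Z_test = f(x_test) + sigma_cov * rng.normal(size=(n, p))

for lam in [0.0, 1e-2, 1.0]:     # lam = 0 is the ridge-less (min-norm) fit
    if lam == 0.0:
        beta = np.linalg.pinv(Z) @ y                 # minimum-norm interpolator
    else:
        beta = np.linalg.solve(Z.T @ Z + lam * np.eye(p), Z.T @ y)
    risk = np.mean((Z_test @ beta - g(x_test)) ** 2)
    print(f"lambda={lam:g}: excess-risk proxy vs latent g: {risk:.4f}")
```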
Related papers
- Optimal Unconstrained Self-Distillation in Ridge Regression: Strict Improvements, Precise Asymptotics, and One-Shot Tuning [61.07540493350384]
Self-distillation (SD) is the process of retraining a student on a mixture of ground-truth labels and the teacher's own predictions.
We show that for any prediction risk, the optimally mixed student improves upon the ridge teacher for every regularization level.
We propose a consistent one-shot tuning method to estimate the optimal mixing weight without grid search, sample splitting, or refitting.
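
Reading the summary above, a minimal sketch of the self-distillation loop for ridge regression (our own toy; the paper's one-shot tuning rule is not reproduced, we simply scan the mixing weight):

```python
# Teacher ridge fit, then a student refit on alpha*y + (1-alpha)*y_teacher.
import numpy as np

rng = np.random.default_rng(1)
n, p, lam = 200, 50, 1.0
beta_true = rng.normal(size=p) / np.sqrt(p)
X = rng.normal(size=(n, p))
y = X @ beta_true + rng.normal(size=n)

def ridge(X, y, lam):
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

beta_teacher = ridge(X, y, lam)
for alpha in [1.0, 0.8, 0.6, 0.4]:
    y_mix = alpha * y + (1 - alpha) * (X @ beta_teacher)
    beta_student = ridge(X, y_mix, lam)
    risk = np.mean((X @ (beta_student - beta_true)) ** 2)  # in-design risk proxy
    print(f"alpha={alpha}: risk proxy {risk:.4f}")
```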
arXiv Detail & Related papers (2026-02-19T17:21:15Z)
- Statistical Inference under Adaptive Sampling with LinUCB [15.167069362020426]
We show that the linear upper confidence bound (LinUCB) algorithm for linear bandits satisfies a property called stability.
We establish a central limit theorem for the LinUCB algorithm, giving asymptotic normality of the estimation error.
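
For reference, a compact implementation of the standard LinUCB update the entry refers to (the stability property and CLT are analytical results, not reproduced here):

```python
# LinUCB: keep a ridge-regularized Gram matrix and pick the arm with the
# highest upper confidence bound; then check the estimation error.
import numpy as np

rng = np.random.default_rng(2)
d, n_rounds, alpha_ucb = 5, 2000, 1.0
theta_true = rng.normal(size=d)

A = np.eye(d)              # regularized Gram matrix  X^T X + I
b = np.zeros(d)            # X^T y
for t in range(n_rounds):
    arms = rng.normal(size=(10, d))            # fresh candidate contexts
    A_inv = np.linalg.inv(A)
    theta_hat = A_inv @ b
    ucb = arms @ theta_hat + alpha_ucb * np.sqrt(np.sum(arms @ A_inv * arms, axis=1))
    x = arms[np.argmax(ucb)]
    r = x @ theta_true + rng.normal()          # noisy linear reward
    A += np.outer(x, x)
    b += r * x

print("estimation error:", np.linalg.norm(np.linalg.solve(A, b) - theta_true))
```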
arXiv Detail & Related papers (2025-11-28T21:48:18Z)
- Multivariate root-n-consistent smoothing parameter free matching estimators and estimators of inverse density weighted expectations [51.000851088730684]
We develop novel modifications of nearest-neighbor and matching estimators which converge at the parametric $\sqrt{n}$-rate.
We stress that our estimators do not involve nonparametric function estimators and in particular do not rely on sample-size dependent smoothing parameters.
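
For context, the classical 1-nearest-neighbor matching estimator these modifications start from (a plain baseline of our own; the paper's root-n-consistent correction is not reproduced):

```python
# 1-NN matching estimate of the average treatment effect: impute each unit's
# counterfactual outcome from its nearest neighbor in the opposite group.
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(3)
n, d = 500, 3
X = rng.normal(size=(n, d))
T = rng.integers(0, 2, size=n)                        # treatment indicator
Y = X @ np.ones(d) + 2.0 * T + rng.normal(size=n)     # true ATE = 2

def match_outcomes(X_from, X_to, Y_to):
    nn = NearestNeighbors(n_neighbors=1).fit(X_to)
    idx = nn.kneighbors(X_from, return_distance=False).ravel()
    return Y_to[idx]

Y1, Y0 = Y.copy(), Y.copy()
Y1[T == 0] = match_outcomes(X[T == 0], X[T == 1], Y[T == 1])
Y0[T == 1] = match_outcomes(X[T == 1], X[T == 0], Y[T == 0])
print("matching ATE estimate:", np.mean(Y1 - Y0))
```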
arXiv Detail & Related papers (2024-07-11T13:28:34Z)
- Relaxed Quantile Regression: Prediction Intervals for Asymmetric Noise [51.87307904567702]
Quantile regression is a leading approach for obtaining such intervals via the empirical estimation of quantiles in the distribution of outputs.
We propose Relaxed Quantile Regression (RQR), a direct alternative to quantile-regression-based interval construction that removes this arbitrary constraint.
We demonstrate that this added flexibility results in intervals with an improvement in desirable qualities.
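
As a baseline for the idea above, standard quantile-regression intervals at fixed levels via the pinball loss (RQR itself, which learns both bounds jointly instead of pinning them to fixed quantiles, is not reproduced):

```python
# 90% prediction interval from two pinball-loss fits at quantiles 0.05 / 0.95,
# on data with asymmetric (exponential) noise.
import numpy as np
from sklearn.linear_model import QuantileRegressor

rng = np.random.default_rng(4)
n = 500
X = rng.uniform(-2, 2, size=(n, 1))
y = X.ravel() + rng.exponential(scale=1.0, size=n)   # asymmetric noise

lo = QuantileRegressor(quantile=0.05, alpha=0.0, solver="highs").fit(X, y)
hi = QuantileRegressor(quantile=0.95, alpha=0.0, solver="highs").fit(X, y)
covered = (lo.predict(X) <= y) & (y <= hi.predict(X))
width = np.mean(hi.predict(X) - lo.predict(X))
print(f"coverage {covered.mean():.2f}, mean width {width:.2f}")
```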
arXiv Detail & Related papers (2024-06-05T13:36:38Z)
- Sparsified Simultaneous Confidence Intervals for High-Dimensional Linear Models [4.675899216825188]
We propose a notion of simultaneous confidence intervals called sparsified simultaneous confidence intervals.
Our intervals are sparse in the sense that some of the intervals' upper and lower bounds are shrunken to zero.
The proposed method can be coupled with various selection procedures, making it ideal for comparing their uncertainty.
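
A toy rendering of the sparsified-interval output format only (this is not the paper's construction and makes no simultaneous-coverage guarantee): per-coefficient intervals whose bounds are snapped to zero when the interval already covers zero.

```python
# Crude OLS intervals with a widened (Bonferroni-style) half-width; bounds that
# straddle zero are snapped to [0, 0] to mimic the sparsified display.
import numpy as np

rng = np.random.default_rng(5)
n, p = 200, 8
beta = np.array([2.0, -1.5, 0, 0, 0, 0.8, 0, 0])
X = rng.normal(size=(n, p))
y = X @ beta + rng.normal(size=n)

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
sigma_hat = np.sqrt(np.sum((y - X @ beta_hat) ** 2) / (n - p))
half = 2.8 * sigma_hat * np.sqrt(np.diag(XtX_inv))   # ad hoc simultaneous width

for j in range(p):
    lo, hi = beta_hat[j] - half[j], beta_hat[j] + half[j]
    if lo <= 0.0 <= hi:
        lo = hi = 0.0   # toy: snap to zero; the paper shrinks bounds with care
    print(f"beta_{j}: [{lo:+.2f}, {hi:+.2f}]")
```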
arXiv Detail & Related papers (2023-07-14T18:37:57Z)
- Generalized equivalences between subsampling and ridge regularization [3.1346887720803505]
We prove structural and risk equivalences between subsampling and ridge regularization for ensemble ridge estimators.
An indirect implication of our equivalences is that optimally tuned ridge regression exhibits a monotonic prediction risk in the data aspect ratio.
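
A quick numerical illustration of the flavor of this equivalence (a toy comparison of our own on one draw; the paper's exact correspondences are analytical):

```python
# An ensemble of min-norm least-squares fits on random subsamples, compared
# against single ridge fits across a small grid of regularization levels.
import numpy as np

rng = np.random.default_rng(6)
n, p, k, n_ens = 300, 100, 150, 50        # k = subsample size
beta = rng.normal(size=p) / np.sqrt(p)
X = rng.normal(size=(n, p))
y = X @ beta + rng.normal(size=n)
X_test = rng.normal(size=(1000, p))

preds = np.zeros(len(X_test))
for _ in range(n_ens):
    idx = rng.choice(n, size=k, replace=False)
    preds += X_test @ (np.linalg.pinv(X[idx]) @ y[idx])
preds /= n_ens

for lam in [0.1, 0.5, 1.0, 2.0]:
    b = np.linalg.solve(X.T @ X + lam * n * np.eye(p), X.T @ y)
    gap = np.mean((X_test @ b - preds) ** 2)
    print(f"lambda={lam}: mean squared gap to ensemble {gap:.4f}")
```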
arXiv Detail & Related papers (2023-05-29T14:05:51Z)
- Counterfactual inference in sequential experiments [17.817769460838665]
We consider after-study statistical inference for sequentially designed experiments wherein multiple units are assigned treatments for multiple time points.
Our goal is to provide inference guarantees for the counterfactual mean at the smallest possible scale.
We illustrate our theory via several simulations and a case study involving data from the HeartSteps mobile health clinical trial.
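
A hedged sketch of one standard route to such counterfactual means, inverse-propensity weighting with logged adaptive assignment probabilities (generic IPW, not the paper's finer-scale guarantees):

```python
# Adaptive treatment assignment with logged propensities; the IPW estimator
# of E[Y(1)] stays unbiased because p_t depends only on the past.
import numpy as np

rng = np.random.default_rng(7)
T, mu1, mu0 = 5000, 1.0, 0.0
p_t, num = 0.5, 0.0
s1 = s0 = n1 = n0 = 1e-3          # running sums/counts for the adaptive rule

for t in range(T):
    a = rng.random() < p_t
    y = (mu1 if a else mu0) + rng.normal()
    if a:
        num += y / p_t            # IPW term at the logged propensity
        s1 += y; n1 += 1
    else:
        s0 += y; n0 += 1
    # simple adaptive rule: lean toward the arm with the higher running mean
    p_t = 0.9 if s1 / n1 > s0 / n0 else 0.1

print("IPW estimate of E[Y(1)]:", num / T, " (truth:", mu1, ")")
```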
arXiv Detail & Related papers (2022-02-14T17:24:27Z)
- Robust Linear Regression for General Feature Distribution [21.0709900887309]
We investigate robust linear regression where data may be contaminated by an oblivious adversary.
We do not necessarily assume that the features are centered.
If the features are centered we can obtain a standard convergence rate.
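
As a hedged stand-in for the setting above, a classical robust baseline, Huber regression, on labels contaminated by an oblivious adversary, with deliberately uncentered features (the paper's estimator and rates are not reproduced):

```python
# OLS vs Huber regression when 10% of labels carry oblivious gross corruption.
import numpy as np
from sklearn.linear_model import HuberRegressor, LinearRegression

rng = np.random.default_rng(8)
n, p = 500, 10
beta = rng.normal(size=p)
X = rng.normal(size=(n, p)) + 1.0          # deliberately NOT centered features
y = X @ beta + rng.normal(size=n)
mask = rng.random(n) < 0.1                 # oblivious label corruption
y[mask] += rng.normal(scale=50.0, size=mask.sum())

for name, model in [("OLS", LinearRegression()),
                    ("Huber", HuberRegressor(max_iter=1000))]:
    model.fit(X, y)
    print(name, "coefficient error:", np.linalg.norm(model.coef_ - beta))
```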
arXiv Detail & Related papers (2022-02-04T11:22:13Z)
- On the Double Descent of Random Features Models Trained with SGD [78.0918823643911]
We study properties of random features (RF) regression in high dimensions optimized by stochastic gradient descent (SGD).
We derive precise non-asymptotic error bounds for RF regression under both constant and adaptive step-size SGD settings.
We observe the double descent phenomenon both theoretically and empirically.
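
A toy trace of the double-descent curve for RF regression; for simplicity we use the closed-form minimum-norm fit at each width, whereas the paper analyzes the SGD-optimized version:

```python
# Test error of ridge-less random features regression as width p crosses n:
# expect a peak near the interpolation threshold p = n, then a second descent.
import numpy as np

rng = np.random.default_rng(9)
n, d = 100, 5
X = rng.normal(size=(n, d))
X_test = rng.normal(size=(2000, d))
f = lambda X: np.sin(X @ np.ones(d))
y, y_test = f(X) + 0.1 * rng.normal(size=n), f(X_test)

for p in [20, 50, 90, 100, 110, 200, 800]:
    W = rng.normal(size=(d, p)) / np.sqrt(d)
    Phi, Phi_test = np.cos(X @ W), np.cos(X_test @ W)
    beta = np.linalg.pinv(Phi) @ y          # min-norm / ridge-less fit
    err = np.mean((Phi_test @ beta - y_test) ** 2)
    print(f"p={p:4d}: test error {err:.3f}")
```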
arXiv Detail & Related papers (2021-10-13T17:47:39Z)
- Multivariate Probabilistic Regression with Natural Gradient Boosting [63.58097881421937]
We propose a Natural Gradient Boosting (NGBoost) approach based on nonparametrically modeling the conditional parameters of the multivariate predictive distribution.
Our method is robust, works out-of-the-box without extensive tuning, is modular with respect to the assumed target distribution, and performs competitively in comparison to existing approaches.
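
A from-scratch univariate sketch of the NGBoost idea, fitting trees to the natural gradient of a Normal negative log-likelihood (the paper's multivariate version generalizes this; parameter choices here are ours):

```python
# Boost the parameters theta = (mu, log sigma) of a Normal predictive
# distribution: at each round, fit trees to the natural gradient of the NLL.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(10)
n = 500
X = rng.uniform(-3, 3, size=(n, 1))
y = np.sin(X.ravel()) + (0.2 + 0.2 * np.abs(X.ravel())) * rng.normal(size=n)

mu = np.full(n, y.mean())
log_sigma = np.full(n, np.log(y.std()))
trees, lr = [], 0.1            # stored trees define the model for new points

for _ in range(200):
    sigma2 = np.exp(2 * log_sigma)
    # natural gradients of the Normal NLL: the ordinary gradient
    # preconditioned by the inverse Fisher information diag(1/sigma^2, 2)
    g_mu = mu - y
    g_ls = 0.5 * (1 - (y - mu) ** 2 / sigma2)
    t_mu = DecisionTreeRegressor(max_depth=3).fit(X, g_mu)
    t_ls = DecisionTreeRegressor(max_depth=3).fit(X, g_ls)
    mu -= lr * t_mu.predict(X)
    log_sigma -= lr * t_ls.predict(X)
    trees.append((t_mu, t_ls))

print("mean NLL:", np.mean(0.5 * np.log(2 * np.pi) + log_sigma
                           + 0.5 * (y - mu) ** 2 / np.exp(2 * log_sigma)))
```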
arXiv Detail & Related papers (2021-06-07T17:44:49Z)
- Benign Overfitting of Constant-Stepsize SGD for Linear Regression [122.70478935214128]
Inductive biases are central in preventing overfitting empirically.
This work considers this issue in arguably the most basic setting: constant-stepsize SGD for linear regression.
We reflect on a number of notable differences between the algorithmic regularization afforded by (unregularized) SGD in comparison to ordinary least squares.
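
A toy run of constant-stepsize SGD with tail averaging next to the minimum-norm least-squares solution it is contrasted with (our own illustration, not the paper's analysis):

```python
# Constant-stepsize SGD with tail averaging on over-parameterized linear
# regression, vs the minimum-norm interpolator (the OLS analogue when p > n).
import numpy as np

rng = np.random.default_rng(11)
n, p, step, epochs = 100, 400, 0.002, 100   # step * ||x||^2 ~ 0.8 < 2: stable
beta_true = rng.normal(size=p) / np.sqrt(p)
X = rng.normal(size=(n, p))
y = X @ beta_true + 0.1 * rng.normal(size=n)

w = np.zeros(p)
w_avg, count = np.zeros(p), 0
for epoch in range(epochs):
    for i in rng.permutation(n):
        w -= step * (X[i] @ w - y[i]) * X[i]
        if epoch >= epochs // 2:            # tail averaging over the last half
            w_avg += w; count += 1
w_avg /= count

w_mn = np.linalg.pinv(X) @ y                # minimum-norm interpolator
print("SGD-avg parameter error :", np.linalg.norm(w_avg - beta_true))
print("min-norm parameter error:", np.linalg.norm(w_mn - beta_true))
```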
arXiv Detail & Related papers (2021-03-23T17:15:53Z)
- Distribution-Free Robust Linear Regression [5.532477732693]
We study random design linear regression with no assumptions on the distribution of the covariates.
We construct a non-linear estimator achieving excess risk of order $d/n$ with the optimal sub-exponential tail.
We prove an optimal version of the classical bound for the truncated least squares estimator due to Györfi, Kohler, Krzyzak, and Walk.
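
A minimal sketch of the truncated least-squares idea attributed to Györfi et al.: fit least squares, then clip predictions to a data-driven range (the truncation level here is our ad hoc choice; the paper's refined non-linear estimator is not reproduced):

```python
# Truncated least squares on a heavy-tailed, unrestricted random design.
import numpy as np

rng = np.random.default_rng(12)
n, d = 200, 5
X = rng.standard_t(df=2.5, size=(n, d))     # heavy-tailed covariates
beta = rng.normal(size=d)
y = X @ beta + rng.standard_t(df=2.5, size=n)

beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
B = np.quantile(np.abs(y), 0.95)            # truncation level from the data

def predict_truncated(X_new):
    return np.clip(X_new @ beta_hat, -B, B)  # truncate predictions to [-B, B]

X_test = rng.standard_t(df=2.5, size=(1000, d))
err = np.mean((predict_truncated(X_test) - X_test @ beta) ** 2)
print("test excess-risk proxy:", err)
```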
arXiv Detail & Related papers (2021-02-25T15:10:41Z)