On Regression in Extreme Regions
- URL: http://arxiv.org/abs/2303.03084v3
- Date: Fri, 12 Sep 2025 14:44:04 GMT
- Title: On Regression in Extreme Regions
- Authors: Stephan Clémençon, Nathan Huet, Anne Sabourin,
- Abstract summary: We establish a statistical learning theoretical framework aimed at extrapolation, or out-of-domain generalization, on the unobserved tails of covariates.<n>We address the stylized problem of nonparametric least squares regression with predictors chosen from a Vapnik-Chervonenkis class.<n>We quantify the predictive performance on tail regions in terms of excess risk, presenting it as a finite sample risk bound with a clear bias-variance decomposition.
- Score: 0.7974430263940756
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We establish a statistical learning theoretical framework aimed at extrapolation, or out-of-domain generalization, on the unobserved tails of covariates in continuous regression problems. Our strategy involves performing statistical regression on a subsample of observations with continuous labels that are the furthest away from the origin, focusing specifically on their angular components. The underlying assumptions of our approach are grounded in the theory of multivariate regular variation, a cornerstone of extreme value theory. We address the stylized problem of nonparametric least squares regression with predictors chosen from a Vapnik-Chervonenkis class. This work contributes to a broader initiative to develop statistical learning theoretical foundations for supervised learning strategies that enhance performance on the supposedly heavy tails of covariates. Previous efforts in this area have focused exclusively on binary classification on extreme covariates. Although the continuous target setting necessitates different techniques and regularity assumptions, our main results echo findings from earlier studies. We quantify the predictive performance on tail regions in terms of excess risk, presenting it as a finite sample risk bound with a clear bias-variance decomposition. Numerical experiments with simulated and real data illustrate our theoretical findings.
Related papers
- Multiply Robust Conformal Risk Control with Coarsened Data [0.0]
Conformal Prediction (CP) has recently received a tremendous amount of interest.<n>In this paper, we consider the general problem of obtaining distribution-free valid prediction regions for an outcome given coarsened data.<n>Our principled use of semiparametric theory has the key advantage of facilitating flexible machine learning methods.
arXiv Detail & Related papers (2025-08-21T12:14:44Z) - Risk Bounds For Distributional Regression [9.92024586772767]
General upper bounds are established for the continuous ranked score (CRPS) and the worst-case mean squared error (MSE) across the domain.<n>Experiments on both simulated and real data validate the theoretical contributions, demonstrating their practical effectiveness.
arXiv Detail & Related papers (2025-05-14T02:22:12Z) - Local minima of the empirical risk in high dimension: General theorems and convex examples [8.748904058015574]
We consider a general model for high-dimensional empirical risk whereby the data vectors $mathbfxi$ are $d- minimization.
We derive sharps on the estimation and prediction error.
arXiv Detail & Related papers (2025-02-04T03:02:24Z) - Generalization and Robustness of the Tilted Empirical Risk [17.48212403081267]
generalization error (risk) of a supervised statistical learning algorithm quantifies its prediction ability on previously unseen data.<n>Inspired by exponential tilting, citetli 2020tilted proposed the it tilted empirical risk (TER) as a non-linear risk metric for machine learning applications.
arXiv Detail & Related papers (2024-09-28T18:31:51Z) - Risk and cross validation in ridge regression with correlated samples [72.59731158970894]
We provide training examples for the in- and out-of-sample risks of ridge regression when the data points have arbitrary correlations.<n>We demonstrate that in this setting, the generalized cross validation estimator (GCV) fails to correctly predict the out-of-sample risk.<n>We further extend our analysis to the case where the test point has nontrivial correlations with the training set, a setting often encountered in time series forecasting.
arXiv Detail & Related papers (2024-08-08T17:27:29Z) - Nonparametric logistic regression with deep learning [1.2509746979383698]
In the nonparametric logistic regression, the Kullback-Leibler divergence could diverge easily.
Instead of analyzing the excess risk itself, it suffices to show the consistency of the maximum likelihood estimator.
As an important application, we derive the convergence rates of the NPMLE with deep neural networks.
arXiv Detail & Related papers (2024-01-23T04:31:49Z) - Optimal Excess Risk Bounds for Empirical Risk Minimization on $p$-Norm Linear Regression [19.31269916674961]
We show that, in the realizable case, under no moment assumptions, $O(d)$ samples are enough to exactly recover the target.
We extend this result to the case $p in (1, 2)$ under mild assumptions that guarantee the existence of the Hessian of the risk at its minimizer.
arXiv Detail & Related papers (2023-10-19T03:21:28Z) - Selective Nonparametric Regression via Testing [54.20569354303575]
We develop an abstention procedure via testing the hypothesis on the value of the conditional variance at a given point.
Unlike existing methods, the proposed one allows to account not only for the value of the variance itself but also for the uncertainty of the corresponding variance predictor.
arXiv Detail & Related papers (2023-09-28T13:04:11Z) - On the Variance, Admissibility, and Stability of Empirical Risk
Minimization [80.26309576810844]
Empirical Risk Minimization (ERM) with squared loss may attain minimax suboptimal error rates.
We show that under mild assumptions, the suboptimality of ERM must be due to large bias rather than variance.
We also show that our estimates imply stability of ERM, complementing the main result of Caponnetto and Rakhlin (2006) for non-Donsker classes.
arXiv Detail & Related papers (2023-05-29T15:25:48Z) - Errors-in-variables Fr\'echet Regression with Low-rank Covariate
Approximation [2.1756081703276]
Fr'echet regression has emerged as a promising approach for regression analysis involving non-Euclidean response variables.
Our proposed framework combines the concepts of global Fr'echet regression and principal component regression, aiming to improve the efficiency and accuracy of the regression estimator.
arXiv Detail & Related papers (2023-05-16T08:37:54Z) - Vector-Valued Least-Squares Regression under Output Regularity
Assumptions [73.99064151691597]
We propose and analyse a reduced-rank method for solving least-squares regression problems with infinite dimensional output.
We derive learning bounds for our method, and study under which setting statistical performance is improved in comparison to full-rank method.
arXiv Detail & Related papers (2022-11-16T15:07:00Z) - Mitigating multiple descents: A model-agnostic framework for risk
monotonization [84.6382406922369]
We develop a general framework for risk monotonization based on cross-validation.
We propose two data-driven methodologies, namely zero- and one-step, that are akin to bagging and boosting.
arXiv Detail & Related papers (2022-05-25T17:41:40Z) - Fluctuations, Bias, Variance & Ensemble of Learners: Exact Asymptotics
for Convex Losses in High-Dimension [25.711297863946193]
We develop a theory for the study of fluctuations in an ensemble of generalised linear models trained on different, but correlated, features.
We provide a complete description of the joint distribution of the empirical risk minimiser for generic convex loss and regularisation in the high-dimensional limit.
arXiv Detail & Related papers (2022-01-31T17:44:58Z) - Heavy-tailed Streaming Statistical Estimation [58.70341336199497]
We consider the task of heavy-tailed statistical estimation given streaming $p$ samples.
We design a clipped gradient descent and provide an improved analysis under a more nuanced condition on the noise of gradients.
arXiv Detail & Related papers (2021-08-25T21:30:27Z) - Iterative Feature Matching: Toward Provable Domain Generalization with
Logarithmic Environments [55.24895403089543]
Domain generalization aims at performing well on unseen test environments with data from a limited number of training environments.
We present a new algorithm based on performing iterative feature matching that is guaranteed with high probability to yield a predictor that generalizes after seeing only $O(logd_s)$ environments.
arXiv Detail & Related papers (2021-06-18T04:39:19Z) - SLOE: A Faster Method for Statistical Inference in High-Dimensional
Logistic Regression [68.66245730450915]
We develop an improved method for debiasing predictions and estimating frequentist uncertainty for practical datasets.
Our main contribution is SLOE, an estimator of the signal strength with convergence guarantees that reduces the computation time of estimation and inference by orders of magnitude.
arXiv Detail & Related papers (2021-03-23T17:48:56Z) - A One-step Approach to Covariate Shift Adaptation [82.01909503235385]
A default assumption in many machine learning scenarios is that the training and test samples are drawn from the same probability distribution.
We propose a novel one-step approach that jointly learns the predictive model and the associated weights in one optimization.
arXiv Detail & Related papers (2020-07-08T11:35:47Z) - Sharp Statistical Guarantees for Adversarially Robust Gaussian
Classification [54.22421582955454]
We provide the first result of the optimal minimax guarantees for the excess risk for adversarially robust classification.
Results are stated in terms of the Adversarial Signal-to-Noise Ratio (AdvSNR), which generalizes a similar notion for standard linear classification to the adversarial setting.
arXiv Detail & Related papers (2020-06-29T21:06:52Z) - Error bounds in estimating the out-of-sample prediction error using
leave-one-out cross validation in high-dimensions [19.439945058410203]
We study the problem of out-of-sample risk estimation in the high dimensional regime.
Extensive empirical evidence confirms the accuracy of leave-one-out cross validation.
One technical advantage of the theory is that it can be used to clarify and connect some results from the recent literature on scalable approximate LO.
arXiv Detail & Related papers (2020-03-03T20:07:07Z) - Weighted Empirical Risk Minimization: Sample Selection Bias Correction
based on Importance Sampling [2.599882743586164]
We consider statistical learning problems when the distribution $P'$ of the training observations $Z'_i$ differs from the distribution $P'$ involved in the risk one seeks to minimize.
We show that, in various situations frequently encountered in practice, it takes a simple form and can be directly estimated from the $Phi(z)$.
We then prove that the capacity generalization of the approach aforementioned is preserved when plugging the resulting estimates of the $Phi(Z'_i)$'s into the weighted empirical risk.
arXiv Detail & Related papers (2020-02-12T18:42:47Z) - Interpolating Predictors in High-Dimensional Factor Regression [2.1055643409860743]
This work studies finite-sample properties of the risk of the minimum-norm interpolating predictor in high-dimensional regression models.
We show that the min-norm interpolating predictor can have similar risk to predictors based on principal components regression and ridge regression, and can improve over LASSO based predictors, in the high-dimensional regime.
arXiv Detail & Related papers (2020-02-06T22:08:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.