Statistical Robustness of Interval CVaR Based Regression Models under Perturbation and Contamination
- URL: http://arxiv.org/abs/2601.11420v1
- Date: Fri, 16 Jan 2026 16:41:57 GMT
- Title: Statistical Robustness of Interval CVaR Based Regression Models under Perturbation and Contamination
- Authors: Yulei You, Junyi Liu
- Abstract summary: We address robust nonlinear regression based on the so-called interval conditional value-at-risk (In-CVaR). We rigorously quantify robustness under contamination, with a unified study of the distributional breakdown point for a broad class of regression models. We show that the In-CVaR based estimator is qualitatively robust in terms of the Prokhorov metric if and only if the largest portion of losses is trimmed.
- Score: 1.578201299411112
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Robustness under perturbation and contamination is a prominent issue in statistical learning. We address robust nonlinear regression based on the so-called interval conditional value-at-risk (In-CVaR), which is introduced to enhance robustness by trimming extreme losses. While recent literature shows that In-CVaR based statistical learning exhibits superior robustness to classical robust regression models, its theoretical robustness analysis for nonlinear regression remains largely unexplored. We rigorously quantify robustness under contamination, with a unified study of the distributional breakdown point for a broad class of regression models, including linear, piecewise affine and neural network models with $\ell_1$, $\ell_2$ and Huber losses. Moreover, we analyze the qualitative robustness of the In-CVaR based estimator under perturbation. We show that under several minor assumptions, the In-CVaR based estimator is qualitatively robust in terms of the Prokhorov metric if and only if the largest portion of losses is trimmed. Overall, this study analyzes the robustness properties of In-CVaR based nonlinear regression models under both perturbation and contamination, illustrating the advantages of the In-CVaR risk measure over conditional value-at-risk and expectation for robust regression in both theory and numerical experiments.
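The trimming idea behind In-CVaR can be illustrated with a small numerical sketch. Assuming that the empirical In-CVaR over an interval of quantile levels $[r_1, r_2]$ averages the sorted losses between those levels, discarding the largest $(1 - r_2)$ fraction (the paper's exact definition may differ in normalization and tie-breaking; the function and parameter names below are illustrative, not from the paper):

```python
import numpy as np

def interval_cvar(losses, r1=0.0, r2=0.9):
    """Empirical interval CVaR sketch: average the sorted losses whose
    rank falls between the r1- and r2-quantile levels, trimming the
    largest (1 - r2) fraction (and optionally the smallest r1 fraction).
    Illustrative only; not the paper's exact estimator."""
    s = np.sort(np.asarray(losses, dtype=float))
    n = len(s)
    lo = int(np.floor(r1 * n))   # drop the smallest r1 fraction
    hi = int(np.ceil(r2 * n))    # drop the largest (1 - r2) fraction
    return s[lo:hi].mean()

sample = [1.0, 2.0, 3.0, 100.0]
trimmed = interval_cvar(sample, r1=0.0, r2=0.75)  # averages 1, 2, 3 -> 2.0
plain = float(np.mean(sample))                    # 26.5, dominated by the outlier
```

On this toy sample the plain expected loss is pulled to 26.5 by the single outlier, while the trimmed In-CVaR value stays at 2.0, which is the robustness-by-trimming effect the abstract describes.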
Related papers
- On the Generalization and Robustness in Conditional Value-at-Risk [12.253712889424584]
We develop a learning-theoretic analysis of Conditional Value-at-Risk (CVaR)-based empirical risk minimization under heavy-tailed and contaminated data. We establish sharp, high-probability generalization and excess risk bounds under minimal moment assumptions. We show that CVaR decisions themselves can be intrinsically unstable under heavy tails.
arXiv Detail & Related papers (2026-02-20T08:10:11Z)
- TraCeR: Transformer-Based Competing Risk Analysis with Longitudinal Covariates [0.0]
TraCeR is a transformer-based survival analysis framework. It estimates the hazard function from a sequence of measurements. Experiments on multiple real-world datasets demonstrate substantial and statistically significant performance improvements.
arXiv Detail & Related papers (2025-12-19T23:24:47Z)
- Risk-Averse Certification of Bayesian Neural Networks [70.44969603471903]
We propose a Risk-Averse Certification framework for Bayesian neural networks called RAC-BNN. Our method leverages sampling and optimisation to compute a sound approximation of the output set of a BNN. We validate RAC-BNN on a range of regression and classification benchmarks and compare its performance with a state-of-the-art method.
arXiv Detail & Related papers (2024-11-29T14:22:51Z)
- Risk and cross validation in ridge regression with correlated samples [72.59731158970894]
We characterize the in- and out-of-sample risks of ridge regression when the data points have arbitrary correlations. We demonstrate that in this setting, the generalized cross validation estimator (GCV) fails to correctly predict the out-of-sample risk. We further extend our analysis to the case where the test point has nontrivial correlations with the training set, a setting often encountered in time series forecasting.
arXiv Detail & Related papers (2024-08-08T17:27:29Z)
- Selective Nonparametric Regression via Testing [54.20569354303575]
We develop an abstention procedure by testing a hypothesis on the value of the conditional variance at a given point.
Unlike existing methods, the proposed one accounts not only for the value of the variance itself but also for the uncertainty of the corresponding variance predictor.
arXiv Detail & Related papers (2023-09-28T13:04:11Z)
- Deep Learning Based Residuals in Non-linear Factor Models: Precision Matrix Estimation of Returns with Low Signal-to-Noise Ratio [0.0]
This paper introduces a consistent estimator and rate of convergence for the precision matrix of asset returns in large portfolios.
Our estimator remains valid even in low signal-to-noise ratio environments typical for financial markets.
arXiv Detail & Related papers (2022-09-09T20:29:54Z)
- Are Latent Factor Regression and Sparse Regression Adequate? [0.49416305961918056]
We provide theoretical guarantees for the estimation of our model in the presence of sub-Gaussian and heavy-tailed noise.
We propose the Factor-Adjusted de-Biased Test (FabTest) and a two-stage ANOVA-type test.
Numerical results illustrate the robustness and effectiveness of our model against latent factor regression and sparse linear regression models.
arXiv Detail & Related papers (2022-03-02T16:22:23Z)
- Regularized Modal Regression on Markov-dependent Observations: A Theoretical Assessment [13.852720406291875]
This paper concerns the statistical properties of regularized modal regression (RMR) under an important dependence structure: Markov dependence.
We establish an upper bound for the RMR estimator under moderate conditions and give an explicit learning rate.
Our results show that Markov dependence affects the generalization error in that the effective sample size is discounted by a multiplicative factor depending on the spectral gap of the underlying Markov chain.
arXiv Detail & Related papers (2021-12-09T09:08:52Z)
- The Interplay Between Implicit Bias and Benign Overfitting in Two-Layer Linear Networks [51.1848572349154]
Neural network models that perfectly fit noisy data can generalize well to unseen test data.
We consider interpolating two-layer linear neural networks trained with gradient flow on the squared loss and derive bounds on the excess risk.
arXiv Detail & Related papers (2021-08-25T22:01:01Z)
- Bayesian Uncertainty Estimation of Learned Variational MRI Reconstruction [63.202627467245584]
We introduce a Bayesian variational framework to quantify the model-immanent (epistemic) uncertainty.
We demonstrate that our approach yields competitive results for undersampled MRI reconstruction.
arXiv Detail & Related papers (2021-02-12T18:08:14Z)
- Multiplicative noise and heavy tails in stochastic optimization [62.993432503309485]
Stochastic optimization is central to modern machine learning, but the reasons for its success remain unclear.
We show that heavy-tailed behavior commonly arises in the parameters due to multiplicative noise.
A detailed analysis of key factors, including step size and data, shows similar results on state-of-the-art neural network models.
arXiv Detail & Related papers (2020-06-11T09:58:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.