Related papers: Robust Regression over Averaged Uncertainty

Robust Regression over Averaged Uncertainty

URL: http://arxiv.org/abs/2311.06960v1
Date: Sun, 12 Nov 2023 20:57:30 GMT
Title: Robust Regression over Averaged Uncertainty
Authors: Dimitris Bertsimas, Yu Ma
Abstract summary: We show that this formulation surprisingly recovers ridge regression and establishes the missing link between robust optimization and the mean squared error approaches for existing regression problems. We show in synthetic datasets with different levels of perturbations, a consistent improvement of the averaged formulation over the existing worst-case formulation in out-of-sample performance.
Score: 8.799946547516155
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We propose a new formulation of robust regression by integrating all realizations of the uncertainty set and taking an averaged approach to obtain the optimal solution for the ordinary least-squared regression problem. We show that this formulation surprisingly recovers ridge regression and establishes the missing link between robust optimization and the mean squared error approaches for existing regression problems. We first prove the equivalence for four uncertainty sets: ellipsoidal, box, diamond, and budget, and provide closed-form formulations of the penalty term as a function of the sample size, feature size, as well as perturbation protection strength. We then show in synthetic datasets with different levels of perturbations, a consistent improvement of the averaged formulation over the existing worst-case formulation in out-of-sample performance. Importantly, as the perturbation level increases, the improvement increases, confirming our method's advantage in high-noise environments. We report similar improvements in the out-of-sample datasets in real-world regression problems obtained from UCI datasets.

Related papers

Semiparametric Counterfactual Regression [2.356908851188234]
We propose a doubly robust-style estimator for counterfactual regression within a generalizable framework. Our approach uses incremental interventions to enhance adaptability while maintaining with standard methods. Our analysis shows that the proposed estimators can achieve $sqrn$-consistency and normality for a broad class of problems.
arXiv Detail & Related papers (2025-04-03T15:32:26Z)
Learning a Class of Mixed Linear Regressions: Global Convergence under General Data Conditions [1.9295130374196499]
Mixed linear regression (MLR) has attracted increasing attention because of its great theoretical and practical importance in nonlinear relationships by utilizing a mixture of linear regression sub-models. Although considerable efforts have been devoted to the learning problem of such systems, most existing investigations impose the strict independent and identically distributed (i.i.d.) or distributed PE conditions.
arXiv Detail & Related papers (2025-03-24T09:57:39Z)
Distributed High-Dimensional Quantile Regression: Estimation Efficiency and Support Recovery [0.0]
We focus on distributed estimation and support recovery for high-dimensional linear quantile regression. We transform the original quantile regression into the least-squares optimization. An efficient algorithm is developed, which enjoys high computation and communication efficiency.
arXiv Detail & Related papers (2024-05-13T08:32:22Z)
Model-Based Epistemic Variance of Values for Risk-Aware Policy Optimization [59.758009422067]
We consider the problem of quantifying uncertainty over expected cumulative rewards in model-based reinforcement learning. We propose a new uncertainty Bellman equation (UBE) whose solution converges to the true posterior variance over values. We introduce a general-purpose policy optimization algorithm, Q-Uncertainty Soft Actor-Critic (QU-SAC) that can be applied for either risk-seeking or risk-averse policy optimization.
arXiv Detail & Related papers (2023-12-07T15:55:58Z)
Conformal Prediction with Missing Values [19.18178194789968]
We first show that the marginal coverage guarantee of conformal prediction holds on imputed data for any missingness distribution. We then show that a universally consistent quantile regression algorithm trained on the imputed data is Bayes optimal for the pinball risk.
arXiv Detail & Related papers (2023-06-05T09:28:03Z)
Errors-in-variables Fr\'echet Regression with Low-rank Covariate Approximation [2.1756081703276]
Fr'echet regression has emerged as a promising approach for regression analysis involving non-Euclidean response variables. Our proposed framework combines the concepts of global Fr'echet regression and principal component regression, aiming to improve the efficiency and accuracy of the regression estimator.
arXiv Detail & Related papers (2023-05-16T08:37:54Z)
RIGID: Robust Linear Regression with Missing Data [7.638042073679073]
We present a robust framework to perform linear regression with missing entries in the features. We show that the proposed formulation, which naturally takes into account the dependency between different variables, reduces to a convex program. In addition to a detailed analysis, we also analyze the behavior of the proposed framework, and present technical discussions.
arXiv Detail & Related papers (2022-05-26T21:10:17Z)
Trustworthy Multimodal Regression with Mixture of Normal-inverse Gamma Distributions [91.63716984911278]
We introduce a novel Mixture of Normal-Inverse Gamma distributions (MoNIG) algorithm, which efficiently estimates uncertainty in principle for adaptive integration of different modalities and produces a trustworthy regression result. Experimental results on both synthetic and different real-world data demonstrate the effectiveness and trustworthiness of our method on various multimodal regression tasks.
arXiv Detail & Related papers (2021-11-11T14:28:12Z)
Complexity-Free Generalization via Distributionally Robust Optimization [4.313143197674466]
We present an alternate route to obtain generalization bounds on the solution from distributionally robust optimization (DRO) Our DRO bounds depend on the ambiguity set geometry and its compatibility with the true loss function. Notably, when using maximum mean discrepancy as a DRO distance metric, our analysis implies, to the best of our knowledge, the first generalization bound in the literature that depends solely on the true loss function.
arXiv Detail & Related papers (2021-06-21T15:19:52Z)
Benign Overfitting of Constant-Stepsize SGD for Linear Regression [122.70478935214128]
inductive biases are central in preventing overfitting empirically. This work considers this issue in arguably the most basic setting: constant-stepsize SGD for linear regression. We reflect on a number of notable differences between the algorithmic regularization afforded by (unregularized) SGD in comparison to ordinary least squares.
arXiv Detail & Related papers (2021-03-23T17:15:53Z)
The Risks of Invariant Risk Minimization [52.7137956951533]
Invariant Risk Minimization is an objective based on the idea for learning deep, invariant features of data. We present the first analysis of classification under the IRM objective--as well as these recently proposed alternatives--under a fairly natural and general model. We show that IRM can fail catastrophically unless the test data are sufficiently similar to the training distribution--this is precisely the issue that it was intended to solve.
arXiv Detail & Related papers (2020-10-12T14:54:32Z)
Accounting for Unobserved Confounding in Domain Generalization [107.0464488046289]
This paper investigates the problem of learning robust, generalizable prediction models from a combination of datasets. Part of the challenge of learning robust models lies in the influence of unobserved confounders. We demonstrate the empirical performance of our approach on healthcare data from different modalities.
arXiv Detail & Related papers (2020-07-21T08:18:06Z)
Approximation Schemes for ReLU Regression [80.33702497406632]
We consider the fundamental problem of ReLU regression. The goal is to output the best fitting ReLU with respect to square loss given to draws from some unknown distribution.
arXiv Detail & Related papers (2020-05-26T16:26:17Z)

This list is automatically generated from the titles and abstracts of the papers in this site.