Robust Regression over Averaged Uncertainty
- URL: http://arxiv.org/abs/2311.06960v2
- Date: Wed, 09 Oct 2024 12:16:15 GMT
- Title: Robust Regression over Averaged Uncertainty
- Authors: Dimitris Bertsimas, Yu Ma,
- Abstract summary: We show that this formulation recovers ridge regression exactly and establishes the missing link between robust optimization and the mean squared error approaches for existing regression problems.
We provide exact, closed-form, in some cases, analytical solutions to the equivalent regularization strength under uncertainty sets induced by $ell_p$ norm, Schatten $p$-norm, and general polytopes.
- Score: 7.4489490661717355
- License:
- Abstract: We propose a new formulation of robust regression by integrating all realizations of the uncertainty set and taking an averaged approach to obtain the optimal solution for the ordinary least squares regression problem. We show that this formulation recovers ridge regression exactly and establishes the missing link between robust optimization and the mean squared error approaches for existing regression problems. We further demonstrate that the condition of this equivalence relies on the geometric properties of the defined uncertainty set. We provide exact, closed-form, in some cases, analytical solutions to the equivalent regularization strength under uncertainty sets induced by $\ell_p$ norm, Schatten $p$-norm, and general polytopes. We then show in synthetic datasets with different levels of uncertainties, a consistent improvement of the averaged formulation over the existing worst-case formulation in out-of-sample performance. In real-world regression problems obtained from UCI datasets, similar improvements are seen in the out-of-sample datasets.
Related papers
- Distributed High-Dimensional Quantile Regression: Estimation Efficiency and Support Recovery [0.0]
We focus on distributed estimation and support recovery for high-dimensional linear quantile regression.
We transform the original quantile regression into the least-squares optimization.
An efficient algorithm is developed, which enjoys high computation and communication efficiency.
arXiv Detail & Related papers (2024-05-13T08:32:22Z) - Model-Based Epistemic Variance of Values for Risk-Aware Policy Optimization [59.758009422067]
We consider the problem of quantifying uncertainty over expected cumulative rewards in model-based reinforcement learning.
We propose a new uncertainty Bellman equation (UBE) whose solution converges to the true posterior variance over values.
We introduce a general-purpose policy optimization algorithm, Q-Uncertainty Soft Actor-Critic (QU-SAC) that can be applied for either risk-seeking or risk-averse policy optimization.
arXiv Detail & Related papers (2023-12-07T15:55:58Z) - Conformal Prediction with Missing Values [19.18178194789968]
We first show that the marginal coverage guarantee of conformal prediction holds on imputed data for any missingness distribution.
We then show that a universally consistent quantile regression algorithm trained on the imputed data is Bayes optimal for the pinball risk.
arXiv Detail & Related papers (2023-06-05T09:28:03Z) - Errors-in-variables Fr\'echet Regression with Low-rank Covariate
Approximation [2.1756081703276]
Fr'echet regression has emerged as a promising approach for regression analysis involving non-Euclidean response variables.
Our proposed framework combines the concepts of global Fr'echet regression and principal component regression, aiming to improve the efficiency and accuracy of the regression estimator.
arXiv Detail & Related papers (2023-05-16T08:37:54Z) - RIGID: Robust Linear Regression with Missing Data [7.638042073679073]
We present a robust framework to perform linear regression with missing entries in the features.
We show that the proposed formulation, which naturally takes into account the dependency between different variables, reduces to a convex program.
In addition to a detailed analysis, we also analyze the behavior of the proposed framework, and present technical discussions.
arXiv Detail & Related papers (2022-05-26T21:10:17Z) - Trustworthy Multimodal Regression with Mixture of Normal-inverse Gamma
Distributions [91.63716984911278]
We introduce a novel Mixture of Normal-Inverse Gamma distributions (MoNIG) algorithm, which efficiently estimates uncertainty in principle for adaptive integration of different modalities and produces a trustworthy regression result.
Experimental results on both synthetic and different real-world data demonstrate the effectiveness and trustworthiness of our method on various multimodal regression tasks.
arXiv Detail & Related papers (2021-11-11T14:28:12Z) - Complexity-Free Generalization via Distributionally Robust Optimization [4.313143197674466]
We present an alternate route to obtain generalization bounds on the solution from distributionally robust optimization (DRO)
Our DRO bounds depend on the ambiguity set geometry and its compatibility with the true loss function.
Notably, when using maximum mean discrepancy as a DRO distance metric, our analysis implies, to the best of our knowledge, the first generalization bound in the literature that depends solely on the true loss function.
arXiv Detail & Related papers (2021-06-21T15:19:52Z) - Benign Overfitting of Constant-Stepsize SGD for Linear Regression [122.70478935214128]
inductive biases are central in preventing overfitting empirically.
This work considers this issue in arguably the most basic setting: constant-stepsize SGD for linear regression.
We reflect on a number of notable differences between the algorithmic regularization afforded by (unregularized) SGD in comparison to ordinary least squares.
arXiv Detail & Related papers (2021-03-23T17:15:53Z) - The Risks of Invariant Risk Minimization [52.7137956951533]
Invariant Risk Minimization is an objective based on the idea for learning deep, invariant features of data.
We present the first analysis of classification under the IRM objective--as well as these recently proposed alternatives--under a fairly natural and general model.
We show that IRM can fail catastrophically unless the test data are sufficiently similar to the training distribution--this is precisely the issue that it was intended to solve.
arXiv Detail & Related papers (2020-10-12T14:54:32Z) - Accounting for Unobserved Confounding in Domain Generalization [107.0464488046289]
This paper investigates the problem of learning robust, generalizable prediction models from a combination of datasets.
Part of the challenge of learning robust models lies in the influence of unobserved confounders.
We demonstrate the empirical performance of our approach on healthcare data from different modalities.
arXiv Detail & Related papers (2020-07-21T08:18:06Z) - Approximation Schemes for ReLU Regression [80.33702497406632]
We consider the fundamental problem of ReLU regression.
The goal is to output the best fitting ReLU with respect to square loss given to draws from some unknown distribution.
arXiv Detail & Related papers (2020-05-26T16:26:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.