On the Minimal Error of Empirical Risk Minimization
- URL: http://arxiv.org/abs/2102.12066v1
- Date: Wed, 24 Feb 2021 04:47:55 GMT
- Title: On the Minimal Error of Empirical Risk Minimization
- Authors: Gil Kur, Alexander Rakhlin
- Abstract summary: We study the minimal error of the Empirical Risk Minimization (ERM) procedure in the task of regression.
Our sharp lower bounds shed light on the possibility (or impossibility) of adapting to the simplicity of the model generating the data.
- Score: 90.09093901700754
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the minimal error of the Empirical Risk Minimization (ERM) procedure
in the task of regression, both in the random and the fixed design settings.
Our sharp lower bounds shed light on the possibility (or impossibility) of
adapting to the simplicity of the model generating the data. In the fixed design
setting, we show that the error is governed by the global complexity of the
entire class. In contrast, in random design, ERM may only adapt to simpler
models if the local neighborhoods around the regression function are nearly as
complex as the class itself, a somewhat counter-intuitive conclusion. We
provide sharp lower bounds for performance of ERM for both Donsker and
non-Donsker classes. We also discuss our results through the lens of recent
studies on interpolation in overparameterized models.
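For context, here is a minimal sketch of the least-squares ERM setup the abstract refers to (the notation below is ours and is only meant to fix ideas): given a function class \mathcal{F} and observations Y_i = f_0(X_i) + \xi_i with f_0 \in \mathcal{F}, the ERM is

  \hat{f}_n \in \operatorname{argmin}_{f \in \mathcal{F}} \frac{1}{n} \sum_{i=1}^{n} \big( f(X_i) - Y_i \big)^2 ,

and its error is E \| \hat{f}_n - f_0 \|^2, with the norm taken as the empirical norm \frac{1}{n} \sum_{i=1}^{n} \big( \hat{f}_n(X_i) - f_0(X_i) \big)^2 in the fixed design setting and as the population L_2(P_X) norm in the random design setting.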
Related papers
- Rethinking generalization of classifiers in separable classes scenarios and over-parameterized regimes [0.0]
We show that in separable-class scenarios the proportion of "bad" global minima diminishes exponentially with the number of training examples n.
We propose a model for the density distribution of the true error, yielding learning curves that align with experiments on MNIST and CIFAR-10.
arXiv Detail & Related papers (2024-10-22T10:12:57Z) - Inverse Reinforcement Learning with Unknown Reward Model based on Structural Risk Minimization [9.44879308639364]
Inverse reinforcement learning (IRL) usually assumes the model of the reward function is pre-specified and estimates the parameter only.
A simplistic model is less likely to contain the real reward function, while a model with high complexity leads to substantial cost and risks overfitting.
This paper addresses this trade-off by introducing the structural risk minimization (SRM) method from statistical learning.
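As a rough sketch of the generic SRM principle invoked here (the paper's specific penalty for reward-function classes is not reproduced), SRM chooses the model class and the hypothesis jointly by minimizing a complexity-penalized empirical risk:

  (\hat{k}, \hat{f}) \in \operatorname{argmin}_{k,\; f \in \mathcal{F}_k} \Big[ \frac{1}{n} \sum_{i=1}^{n} \ell(f, z_i) + \mathrm{pen}(\mathcal{F}_k, n) \Big] ,

where the penalty \mathrm{pen}(\mathcal{F}_k, n) grows with the complexity of \mathcal{F}_k, trading approximation error against estimation error.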
arXiv Detail & Related papers (2023-12-27T13:23:17Z) - On the Variance, Admissibility, and Stability of Empirical Risk Minimization [80.26309576810844]
Empirical Risk Minimization (ERM) with squared loss may attain minimax suboptimal error rates.
We show that under mild assumptions, the suboptimality of ERM must be due to large bias rather than variance.
We also show that our estimates imply stability of ERM, complementing the main result of Caponnetto and Rakhlin (2006) for non-Donsker classes.
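The bias/variance claim can be read through the standard decomposition of an estimator's squared error (a textbook identity, stated here only for context):

  E \| \hat{f} - f_0 \|^2 \;=\; \| E\,\hat{f} - f_0 \|^2 \;+\; E \| \hat{f} - E\,\hat{f} \|^2 ,

i.e. squared bias plus variance; the result above says that, under mild assumptions, the variance term of ERM is controlled, so any minimax suboptimality must enter through the bias term.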
arXiv Detail & Related papers (2023-05-29T15:25:48Z) - A Model-Based Method for Minimizing CVaR and Beyond [7.751691910877239]
We develop a variant of the prox-linear method for minimizing the Conditional Value-at-Risk (CVaR) objective.
CVaR is a risk measure that targets worst-case performance, defined as the average of the losses in the top quantile.
In machine learning, such a risk measure is useful to train more robust models.
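For reference, one standard way to write CVaR at level \alpha is the Rockafellar-Uryasev variational form (notation ours):

  \mathrm{CVaR}_{\alpha}(L) \;=\; \inf_{t \in \mathbb{R}} \Big\{ t + \tfrac{1}{1-\alpha}\, E\big[ (L - t)_{+} \big] \Big\} ,

which, for a continuous loss distribution, equals the expected loss conditional on the loss exceeding its \alpha-quantile, i.e. the average of the worst (1-\alpha) fraction of the losses.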
arXiv Detail & Related papers (2023-05-27T15:38:53Z) - DAIR: Data Augmented Invariant Regularization [20.364846667289374]
In this paper, we propose data augmented invariant regularization (DAIR).
We show that a particular form of the DAIR regularizer consistently performs well in a variety of settings.
We apply it to multiple real-world learning problems involving domain shift.
arXiv Detail & Related papers (2021-10-21T15:30:40Z) - A Wasserstein Minimax Framework for Mixed Linear Regression [69.40394595795544]
Multi-modal distributions are commonly used to model clustered data in learning tasks.
We propose an optimal transport-based framework for Mixed Linear Regression problems.
arXiv Detail & Related papers (2021-06-14T16:03:51Z) - The Risks of Invariant Risk Minimization [52.7137956951533]
Invariant Risk Minimization (IRM) is an objective based on the idea of learning deep, invariant features of data.
We present the first analysis of classification under the IRM objective, as well as its recently proposed alternatives, under a fairly natural and general model.
We show that IRM can fail catastrophically unless the test data are sufficiently similar to the training distribution; this is precisely the issue it was intended to solve.
arXiv Detail & Related papers (2020-10-12T14:54:32Z) - Unbiased Risk Estimators Can Mislead: A Case Study of Learning with Complementary Labels [92.98756432746482]
We study a weakly supervised problem called learning with complementary labels.
We show that the quality of gradient estimation matters more in risk minimization.
We propose a novel surrogate complementary loss (SCL) framework that trades zero bias for reduced variance.
arXiv Detail & Related papers (2020-07-05T04:19:37Z) - MMCGAN: Generative Adversarial Network with Explicit Manifold Prior [78.58159882218378]
We propose to employ explicit manifold learning as a prior to alleviate mode collapse and stabilize GAN training.
Our experiments on both the toy data and real datasets show the effectiveness of MMCGAN in alleviating mode collapse, stabilizing training, and improving the quality of generated samples.
arXiv Detail & Related papers (2020-06-18T07:38:54Z)