On the Variance, Admissibility, and Stability of Empirical Risk
Minimization
- URL: http://arxiv.org/abs/2305.18508v1
- Date: Mon, 29 May 2023 15:25:48 GMT
- Title: On the Variance, Admissibility, and Stability of Empirical Risk
Minimization
- Authors: Gil Kur, Eli Putterman and Alexander Rakhlin
- Abstract summary: Empirical Risk Minimization (ERM) with squared loss may attain minimax suboptimal error rates.
We show that under mild assumptions, the suboptimality of ERM must be due to large bias rather than variance.
We also show that our estimates imply stability of ERM, complementing the main result of Caponnetto and Rakhlin (2006) for non-Donsker classes.
- Score: 80.26309576810844
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: It is well known that Empirical Risk Minimization (ERM) with squared loss may
attain minimax suboptimal error rates (Birgé and Massart, 1993). The key
message of this paper is that, under mild assumptions, the suboptimality of ERM
must be due to large bias rather than variance. More precisely, in the
bias-variance decomposition of the squared error of the ERM, the variance term
necessarily enjoys the minimax rate. In the case of fixed design, we provide an
elementary proof of this fact using the probabilistic method. Then, we prove
this result for various models in the random design setting. In addition, we
provide a simple proof of Chatterjee's admissibility theorem (Chatterjee, 2014,
Theorem 1.4), which states that ERM cannot be ruled out as an optimal method,
in the fixed design setting, and extend this result to the random design
setting. We also show that our estimates imply stability of ERM, complementing
the main result of Caponnetto and Rakhlin (2006) for non-Donsker classes.
Finally, we show that for non-Donsker classes, there are functions close to the
ERM, yet far from being almost-minimizers of the empirical loss, highlighting
the somewhat irregular nature of the loss landscape.
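To make the key message concrete, the claim refers to the standard bias-variance decomposition of the squared error of the ERM. The following is a sketch in notation not used above ($f^*$ is the target regression function, $\hat f_n$ the empirical risk minimizer over the class, and the expectation is over the training sample); the paper's exact definitions, e.g. conditioning on the design points in the fixed-design case, may differ:
\[
\mathbb{E}\,\bigl\|\hat f_n - f^*\bigr\|^2
  = \underbrace{\bigl\|\mathbb{E}\hat f_n - f^*\bigr\|^2}_{\text{bias}}
  + \underbrace{\mathbb{E}\,\bigl\|\hat f_n - \mathbb{E}\hat f_n\bigr\|^2}_{\text{variance}},
\]
where the cross term $\mathbb{E}\,\langle \hat f_n - \mathbb{E}\hat f_n,\ \mathbb{E}\hat f_n - f^*\rangle$ vanishes. The paper's message is that, under mild assumptions, the variance term is always of minimax order, so any minimax suboptimality of ERM (as in the Birgé-Massart examples) must come from the bias term.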
Related papers
- Revisiting Essential and Nonessential Settings of Evidential Deep Learning [70.82728812001807]
Evidential Deep Learning (EDL) is an emerging method for uncertainty estimation.
We propose Re-EDL, a simplified yet more effective variant of EDL.
arXiv Detail & Related papers (2024-10-01T04:27:07Z)
- Nonparametric logistic regression with deep learning [1.2509746979383698]
In nonparametric logistic regression, the Kullback-Leibler divergence can diverge easily.
Instead of analyzing the excess risk itself, it suffices to show the consistency of the maximum likelihood estimator.
As an important application, we derive the convergence rates of the NPMLE with deep neural networks.
arXiv Detail & Related papers (2024-01-23T04:31:49Z)
- Efficient Stochastic Approximation of Minimax Excess Risk Optimization [36.68685001551774]
We develop efficient approximation approaches which directly target MERO.
We demonstrate that the bias caused by the estimation error of the minimal risk is under control.
We also investigate a practical scenario where the quantity of samples drawn from each distribution may differ, and propose an approach that delivers distribution-dependent convergence rates.
arXiv Detail & Related papers (2023-05-31T02:21:11Z)
- Learning to Estimate Without Bias [57.82628598276623]
The Gauss-Markov theorem states that the weighted least squares estimator is the linear minimum variance unbiased estimator (MVUE) in linear models.
In this paper, we take a first step towards extending this result to nonlinear settings via deep learning with bias constraints.
A second motivation for the bias-constrained estimator (BCE) arises in applications where multiple estimates of the same unknown are averaged for improved performance.
arXiv Detail & Related papers (2021-10-24T10:23:51Z)
- On the Minimal Error of Empirical Risk Minimization [90.09093901700754]
We study the minimal error of the Empirical Risk Minimization (ERM) procedure in the task of regression.
Our sharp lower bounds shed light on the possibility (or impossibility) of adapting to the simplicity of the model generating the data.
arXiv Detail & Related papers (2021-02-24T04:47:55Z)
- Does Invariant Risk Minimization Capture Invariance? [23.399091822468407]
We show that the Invariant Risk Minimization (IRM) formulation of Arjovsky et al. can fail to capture "natural" invariances.
This can lead to worse generalization on new environments.
arXiv Detail & Related papers (2021-01-04T18:02:45Z)
- The Risks of Invariant Risk Minimization [52.7137956951533]
Invariant Risk Minimization (IRM) is an objective based on the idea of learning deep, invariant features of data.
We present the first analysis of classification under the IRM objective, as well as recently proposed alternatives, under a fairly natural and general model.
We show that IRM can fail catastrophically unless the test data are sufficiently similar to the training distribution, which is precisely the issue it was intended to solve.
arXiv Detail & Related papers (2020-10-12T14:54:32Z)
- Unbiased Risk Estimators Can Mislead: A Case Study of Learning with Complementary Labels [92.98756432746482]
We study a weakly supervised problem called learning with complementary labels.
We show that the quality of gradient estimation matters more in risk minimization.
We propose a novel surrogate complementary loss (SCL) framework that trades zero bias for reduced variance.
arXiv Detail & Related papers (2020-07-05T04:19:37Z)