Frustratingly Easy Model Generalization by Dummy Risk Minimization
- URL: http://arxiv.org/abs/2308.02287v2
- Date: Sat, 7 Oct 2023 05:53:30 GMT
- Title: Frustratingly Easy Model Generalization by Dummy Risk Minimization
- Authors: Juncheng Wang, Jindong Wang, Xixu Hu, Shujun Wang, Xing Xie
- Abstract summary: Dummy Risk Minimization (DuRM) is a frustratingly easy and general technique to improve the generalization of empirical risk minimization (ERM).
We show that DuRM consistently improves performance across all tasks in an almost free-lunch manner.
- Score: 38.67678021055096
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Empirical risk minimization (ERM) is a fundamental machine learning paradigm.
However, its generalization ability is limited in various tasks. In this paper,
we devise Dummy Risk Minimization (DuRM), a frustratingly easy and general
technique to improve the generalization of ERM. DuRM is extremely simple to
implement: just enlarge the dimension of the output logits and then optimize
using standard gradient descent. Moreover, we validate the efficacy of DuRM
through both theoretical and empirical analysis. Theoretically, we show that
DuRM induces greater gradient variance, which facilitates model generalization
by steering optimization toward flatter local minima. Empirically, we conduct
evaluations of DuRM across different datasets, modalities, and network
architectures on diverse tasks, including conventional classification, semantic
segmentation, out-of-distribution generalization, adversarial training, and
long-tailed recognition. Results demonstrate that DuRM consistently improves
performance across all tasks in an almost free-lunch manner.
Furthermore, we show that DuRM is compatible with existing generalization
techniques and we discuss possible limitations. We hope that DuRM could trigger
new interest in the fundamental research on risk minimization.
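Since the abstract describes DuRM as nothing more than enlarging the output logit dimension before standard training, a minimal PyTorch-style sketch of that idea may help. The class name, the number of dummy logits (`num_dummy`), and the plain cross-entropy loop below are illustrative assumptions, not the authors' exact configuration.

```python
import torch
import torch.nn as nn

class DuRMHead(nn.Module):
    """Classification head with extra 'dummy' output logits (sketch of DuRM).

    The final linear layer predicts num_classes + num_dummy logits; labels
    still range over the real classes only, so the dummy logits never match
    a ground-truth label and act purely through the softmax and its gradient.
    """

    def __init__(self, in_features: int, num_classes: int, num_dummy: int = 1):
        super().__init__()
        self.fc = nn.Linear(in_features, num_classes + num_dummy)
        self.num_classes = num_classes

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        return self.fc(features)  # shape: (batch, num_classes + num_dummy)

# Training uses ordinary cross-entropy over the enlarged logits.
head = DuRMHead(in_features=512, num_classes=10, num_dummy=1)
features = torch.randn(8, 512)
labels = torch.randint(0, 10, (8,))
loss = nn.functional.cross_entropy(head(features), labels)
loss.backward()

# At test time, predictions are taken over the real classes only.
pred = head(features)[:, :head.num_classes].argmax(dim=1)
```

Because the dummy logits are simply discarded at inference, the method adds essentially no cost, which is presumably what the abstract's "almost free lunch" refers to.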
Related papers
- Invariant Risk Minimization Is A Total Variation Model [3.000494957386027]
Invariant risk minimization (IRM) is an arising approach to generalize invariant features to different environments in machine learning.
We show that IRM is essentially a total variation model based on the $L^2$ norm (TV-$\ell_2$) of the learning risk.
We propose a novel IRM framework based on the TV-$\ell_2$ model.
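For reference, the TV-$\ell_2$ label refers to the classical isotropic total variation, which takes the $\ell_2$ norm of the gradient. The formula below is standard background, not the paper's exact functional:

```latex
% Isotropic (ell_2) total variation of a differentiable function u on a
% domain Omega; the cited paper argues the IRM learning risk behaves like
% such a functional.
\mathrm{TV}_{\ell_2}(u) \;=\; \int_{\Omega} \lVert \nabla u(x) \rVert_2 \, \mathrm{d}x
```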
arXiv Detail & Related papers (2024-05-02T15:34:14Z)
- On the Variance, Admissibility, and Stability of Empirical Risk Minimization [80.26309576810844]
Empirical Risk Minimization (ERM) with squared loss may attain minimax suboptimal error rates.
We show that under mild assumptions, the suboptimality of ERM must be due to large bias rather than variance.
We also show that our estimates imply stability of ERM, complementing the main result of Caponnetto and Rakhlin (2006) for non-Donsker classes.
arXiv Detail & Related papers (2023-05-29T15:25:48Z)
- Modeling the Q-Diversity in a Min-max Play Game for Robust Optimization [61.39201891894024]
Group distributionally robust optimization (group DRO) can minimize the worst-case loss over pre-defined groups.
We reformulate the group DRO framework by proposing Q-Diversity.
Characterized by an interactive training mode, Q-Diversity relaxes the group identification from annotation into direct parameterization.
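As background for the worst-case loss mentioned above, the standard group DRO objective can be written as follows; this is the conventional formulation, not notation taken from this paper's abstract:

```latex
% Group DRO: minimize the worst-case expected loss over predefined groups g,
% where P_g is the data distribution of group g.
\min_{\theta} \; \max_{g \in \mathcal{G}} \;
\mathbb{E}_{(x, y) \sim P_g}\!\left[ \ell\big(f_{\theta}(x), y\big) \right]
```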
arXiv Detail & Related papers (2023-05-20T07:02:27Z)
- What Is Missing in IRM Training and Evaluation? Challenges and Solutions [41.56612265456626]
Invariant risk minimization (IRM) has received increasing attention as a way to acquire environment-agnostic data representations and predictions.
Recent works have found that the optimality of the originally proposed IRM optimization scheme may be compromised in practice.
We identify and resolve three practical limitations in IRM training and evaluation.
arXiv Detail & Related papers (2023-03-04T07:06:24Z)
- Pareto Invariant Risk Minimization [32.01775861630696]
We propose a new optimization scheme for invariant risk minimization (IRM) called PAreto Invariant Risk Minimization (PAIR).
We show that PAIR can empower practical IRM variants to overcome the barriers of the original IRM when provided with proper guidance.
arXiv Detail & Related papers (2022-06-15T19:04:02Z)
- Hierarchies of Reward Machines [75.55324974788475]
Reward machines (RMs) are a recent formalism for representing the reward function of a reinforcement learning task through a finite-state machine.
We propose a formalism for further abstracting the subtask structure by endowing an RM with the ability to call other RMs.
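Since a reward machine is a finite-state machine over abstract events, a minimal sketch may help. It shows the base formalism only, not the hierarchical extension the paper proposes, and the event names and reward values are invented for illustration:

```python
# Minimal reward machine sketch: states, event-labeled transitions, and
# per-transition rewards. Events and reward values are illustrative only.
from typing import Dict, Tuple

class RewardMachine:
    def __init__(self,
                 transitions: Dict[Tuple[str, str], Tuple[str, float]],
                 initial_state: str):
        # transitions: (state, event) -> (next_state, reward)
        self.transitions = transitions
        self.state = initial_state

    def step(self, event: str) -> float:
        """Advance the machine on an observed event; return the reward."""
        self.state, reward = self.transitions.get(
            (self.state, event), (self.state, 0.0))
        return reward

# Example task: pick up the key, then reach the door, to earn the reward.
rm = RewardMachine(
    transitions={("u0", "got_key"): ("u1", 0.0),
                 ("u1", "at_door"): ("u_done", 1.0)},
    initial_state="u0",
)
assert rm.step("got_key") == 0.0 and rm.step("at_door") == 1.0
```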
arXiv Detail & Related papers (2022-05-31T12:39:24Z)
- The Missing Invariance Principle Found -- the Reciprocal Twin of Invariant Risk Minimization [7.6146285961466]
Invariant risk minimization (IRM) can generalize poorly to out-of-distribution (OOD) data.
We show that MRI-v1 can guarantee invariant predictors given sufficient environments.
We also demonstrate that MRI strongly outperforms IRM and achieves near-optimal OOD generalization in image-based problems.
arXiv Detail & Related papers (2022-05-29T00:14:51Z)
- DAIR: Data Augmented Invariant Regularization [20.364846667289374]
In this paper, we propose data augmented invariant regularization (DAIR).
We show that a particular form of the DAIR regularizer consistently performs well in a variety of settings.
We apply it to multiple real-world learning problems involving domain shift.
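As a rough illustration of what an invariance regularizer over augmented pairs can look like (a generic consistency penalty, not necessarily DAIR's exact form; $x'$ denotes an augmentation of $x$ and $\lambda$ a tradeoff weight):

```latex
% Generic augmented-invariance objective: fit the data while penalizing
% disagreement between the losses on a sample and its augmented counterpart.
\min_{\theta} \; \mathbb{E}_{(x, y)}\Big[
  \ell\big(f_{\theta}(x), y\big)
  + \lambda \,\big( \ell(f_{\theta}(x), y) - \ell(f_{\theta}(x'), y) \big)^{2}
\Big]
```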
arXiv Detail & Related papers (2021-10-21T15:30:40Z)
- On the Minimal Error of Empirical Risk Minimization [90.09093901700754]
We study the minimal error of the Empirical Risk Minimization (ERM) procedure in the task of regression.
Our sharp lower bounds shed light on the possibility (or impossibility) of adapting to simplicity of the model generating the data.
arXiv Detail & Related papers (2021-02-24T04:47:55Z)
- MMCGAN: Generative Adversarial Network with Explicit Manifold Prior [78.58159882218378]
We propose to employ explicit manifold learning as a prior to alleviate mode collapse and stabilize the training of GANs.
Our experiments on both the toy data and real datasets show the effectiveness of MMCGAN in alleviating mode collapse, stabilizing training, and improving the quality of generated samples.
arXiv Detail & Related papers (2020-06-18T07:38:54Z)