DAIR: Data Augmented Invariant Regularization
- URL: http://arxiv.org/abs/2110.11205v1
- Date: Thu, 21 Oct 2021 15:30:40 GMT
- Title: DAIR: Data Augmented Invariant Regularization
- Authors: Tianjian Huang and Shaunak Halbe and Chinnadhurai Sankar and Pooyan
Amini and Satwik Kottur and Alborz Geramifard and Meisam Razaviyayn and Ahmad
Beirami
- Abstract summary: In this paper, we propose data augmented invariant regularization (DAIR).
We show that a particular form of the DAIR regularizer consistently performs well in a variety of settings.
We apply it to multiple real-world learning problems involving domain shift.
- Score: 20.364846667289374
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While deep learning through empirical risk minimization (ERM) has succeeded
at achieving human-level performance at a variety of complex tasks, ERM
generalizes poorly to distribution shift. This is partly explained by
overfitting to spurious features such as background in images or named entities
in natural language. Synthetic data augmentation followed by empirical risk
minimization (DA-ERM) is a simple yet powerful solution to remedy this problem.
In this paper, we propose data augmented invariant regularization (DAIR). The
idea behind DAIR is that the model's performance (loss) should be consistent
on an augmented sample and its original counterpart. DAIR introduces a
regularizer on top of DA-ERM to penalize such loss inconsistency. Both
theoretically and empirically, we show that a particular form of the DAIR
regularizer consistently performs well in a variety of settings. We
apply it to multiple real-world learning problems involving domain shift,
namely robust regression, visual question answering, robust deep neural network
training, and task-oriented dialog modeling. Our experiments show that DAIR
consistently outperforms ERM and DA-ERM at little marginal cost, setting new
state-of-the-art results on several benchmarks.
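Concretely, where DA-ERM minimizes the average empirical risk over the original and augmented samples, DAIR adds a penalty on the discrepancy between a sample's loss and the loss on its augmented counterpart. The PyTorch-style sketch below is only an illustration of that reading, not the authors' code: the function name dair_objective, the classification setup, and the generic squared-discrepancy penalty with weight lam are assumptions, and the particular regularizer form analyzed in the paper may differ.

```python
# Minimal sketch of a DAIR-style objective (illustrative; not the paper's code).
# Assumes a classifier `model`, an original batch `x`, its augmented copy `x_aug`
# (sharing labels `y`), and a placeholder penalty weight `lam`.
import torch.nn.functional as F


def dair_objective(model, x, x_aug, y, lam=1.0):
    # Per-example losses on the original and the augmented samples.
    loss_orig = F.cross_entropy(model(x), y, reduction="none")
    loss_aug = F.cross_entropy(model(x_aug), y, reduction="none")
    # DA-ERM term: empirical risk averaged over original and augmented data.
    erm_term = 0.5 * (loss_orig.mean() + loss_aug.mean())
    # DAIR-style regularizer: penalize inconsistency between the two losses
    # (a generic squared discrepancy here; the paper studies a particular form).
    reg_term = ((loss_orig - loss_aug) ** 2).mean()
    return erm_term + lam * reg_term
```

Setting lam = 0 recovers plain DA-ERM, so the regularizer acts as a knob trading off raw fit on the augmented data against consistency between the original and augmented views.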
Related papers
- Investigating the Impact of Model Complexity in Large Language Models [3.7919508292745676]
Large Language Models (LLMs) based on the pre-training and fine-tuning paradigm have become pivotal in solving natural language processing tasks.
In this paper, we focus on autoregressive LLMs and propose to employ Hidden Markov Models (HMMs) to model them.
arXiv Detail & Related papers (2024-10-01T13:53:44Z)
- Frustratingly Easy Model Generalization by Dummy Risk Minimization [38.67678021055096]
Dummy Risk Minimization (DuRM) is a frustratingly easy and general technique for improving the generalization of empirical risk minimization (ERM).
We show that DuRM consistently improves performance across all tasks in an almost free-lunch manner.
arXiv Detail & Related papers (2023-08-04T12:43:54Z)
- Coping with Change: Learning Invariant and Minimum Sufficient Representations for Fine-Grained Visual Categorization [26.254072665916155]
Fine-grained visual categorization (FGVC) is a challenging task due to similar visual appearances between various species.
Previous studies assume that the training and test data have the same underlying distributions, and that features extracted by modern backbone architectures remain discriminative and generalize well to unseen test data.
We combine the merits of invariant risk minimization (IRM) and information bottleneck (IB) principle to learn invariant and minimum sufficient (IMS) representations for FGVC.
arXiv Detail & Related papers (2023-06-08T02:45:15Z)
- Modeling the Q-Diversity in a Min-max Play Game for Robust Optimization [61.39201891894024]
Group distributionally robust optimization (group DRO) can minimize the worst-case loss over pre-defined groups.
We reformulate the group DRO framework by proposing Q-Diversity.
Characterized by an interactive training mode, Q-Diversity relaxes the group identification from annotation into direct parameterization.
arXiv Detail & Related papers (2023-05-20T07:02:27Z)
- Learning Optimal Features via Partial Invariance [18.552839725370383]
Invariant Risk Minimization (IRM) is a popular framework that aims to learn robust models from multiple environments.
We show that IRM can over-constrain the predictor; to remedy this, we propose a relaxation via partial invariance.
Several experiments, conducted both in linear settings as well as with deep neural networks on tasks over both language and image data, allow us to verify our conclusions.
arXiv Detail & Related papers (2023-01-28T02:48:14Z)
- Discrete Auto-regressive Variational Attention Models for Text Modeling [53.38382932162732]
Variational autoencoders (VAEs) have been widely applied for text modeling.
They are troubled by two challenges: information underrepresentation and posterior collapse.
We propose Discrete Auto-regressive Variational Attention Model (DAVAM) to address the challenges.
arXiv Detail & Related papers (2021-06-16T06:36:26Z)
- Contrastive Model Inversion for Data-Free Knowledge Distillation [60.08025054715192]
We propose Contrastive Model Inversion (CMI), where the data diversity is explicitly modeled as an optimizable objective.
Our main observation is that, under the constraint of the same amount of data, higher data diversity usually indicates stronger instance discrimination.
Experiments on CIFAR-10, CIFAR-100, and Tiny-ImageNet demonstrate that CMI achieves significantly superior performance when the generated data are used for knowledge distillation.
arXiv Detail & Related papers (2021-05-18T15:13:00Z)
- Meta-Learned Invariant Risk Minimization [12.6484257912092]
Empirical Risk Minimization (ERM) based machine learning algorithms have suffered from weak generalization performance on out-of-distribution (OOD) data.
In this paper, we propose a novel meta-learning based approach for IRM.
We show that our algorithm not only has better OOD generalization performance than IRMv1 and all IRM variants, but also addresses the weakness of IRMv1 with improved stability.
arXiv Detail & Related papers (2021-03-24T02:52:48Z)
- On the Minimal Error of Empirical Risk Minimization [90.09093901700754]
We study the minimal error of the Empirical Risk Minimization (ERM) procedure in the task of regression.
Our sharp lower bounds shed light on the possibility (or impossibility) of adapting to simplicity of the model generating the data.
arXiv Detail & Related papers (2021-02-24T04:47:55Z)
- MMCGAN: Generative Adversarial Network with Explicit Manifold Prior [78.58159882218378]
We propose to employ explicit manifold learning as a prior to alleviate mode collapse and stabilize the training of GANs.
Our experiments on both the toy data and real datasets show the effectiveness of MMCGAN in alleviating mode collapse, stabilizing training, and improving the quality of generated samples.
arXiv Detail & Related papers (2020-06-18T07:38:54Z)
- SUOD: Accelerating Large-Scale Unsupervised Heterogeneous Outlier Detection [63.253850875265115]
Outlier detection (OD) is a key machine learning (ML) task for identifying abnormal objects from general samples.
We propose a modular acceleration system, called SUOD, to speed up large-scale heterogeneous OD.
arXiv Detail & Related papers (2020-03-11T00:22:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.