DART: Diversify-Aggregate-Repeat Training Improves Generalization of
Neural Networks
- URL: http://arxiv.org/abs/2302.14685v2
- Date: Sat, 10 Jun 2023 15:11:02 GMT
- Title: DART: Diversify-Aggregate-Repeat Training Improves Generalization of
Neural Networks
- Authors: Samyak Jain, Sravanti Addepalli, Pawan Sahu, Priyam Dey and R.
Venkatesh Babu
- Abstract summary: Generalization of neural networks is crucial for deploying them safely in the real world.
In this work, we first establish a surprisingly simple but strong benchmark for generalization which utilizes diverse augmentations within a training minibatch.
We then propose Diversify-Aggregate-Repeat Training (DART) strategy that first trains diverse models using different augmentations (or domains) to explore the loss basin.
We find that Repeating the step of aggregation throughout training improves the overall optimization trajectory and also ensures that the individual models have a sufficiently low loss barrier to obtain improved generalization on combining them.
- Score: 39.69378006723682
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generalization of neural networks is crucial for deploying them safely in the
real world. Common training strategies to improve generalization involve the
use of data augmentations, ensembling and model averaging. In this work, we
first establish a surprisingly simple but strong benchmark for generalization
which utilizes diverse augmentations within a training minibatch, and show that
this can learn a more balanced distribution of features. Further, we propose
Diversify-Aggregate-Repeat Training (DART) strategy that first trains diverse
models using different augmentations (or domains) to explore the loss basin,
and further Aggregates their weights to combine their expertise and obtain
improved generalization. We find that Repeating the step of Aggregation
throughout training improves the overall optimization trajectory and also
ensures that the individual models have a sufficiently low loss barrier to
obtain improved generalization on combining them. We shed light on our approach
by casting it in the framework proposed by Shen et al. and theoretically show
that it indeed generalizes better. In addition to improvements in In-Domain
generalization, we demonstrate SOTA performance on the Domain Generalization
benchmarks in the popular DomainBed framework as well. Our method is generic
and can easily be integrated with several base training algorithms to achieve
performance gains.
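To make the Diversify-Aggregate-Repeat loop described above concrete, the following is a minimal sketch of the idea, not the authors' implementation: the backbone, augmentations, aggregation interval, and dummy data below are placeholder assumptions, and DART itself is agnostic to these choices.

```python
# Sketch of Diversify-Aggregate-Repeat training (hypothetical, simplified).
# Each branch model sees a different augmentation (Diversify); every
# `aggregate_every` steps the branch weights are averaged and broadcast
# back to all branches (Aggregate), and training continues (Repeat).
import copy
import torch
import torch.nn as nn

def make_model():
    # Stand-in backbone; DART does not depend on a particular architecture.
    return nn.Sequential(nn.Flatten(), nn.Linear(32 * 32 * 3, 10))

def aggregate_(models):
    """Average the weights of all branches and copy the mean back to each."""
    with torch.no_grad():
        avg = {k: torch.stack([m.state_dict()[k].float() for m in models]).mean(0)
               for k in models[0].state_dict()}
        for m in models:
            m.load_state_dict(avg)

# Diversify: one branch per augmentation (placeholder augmentations here).
augmentations = [
    lambda x: x,                                   # identity
    lambda x: x + 0.01 * torch.randn_like(x),      # noise
    lambda x: x.flip(-1),                          # horizontal flip
]
base = make_model()
models = [copy.deepcopy(base) for _ in augmentations]  # common starting point
optims = [torch.optim.SGD(m.parameters(), lr=0.1) for m in models]
criterion = nn.CrossEntropyLoss()

aggregate_every = 100  # Repeat: interleave aggregation with training
for step in range(300):
    x = torch.randn(8, 3, 32, 32)            # dummy batch; replace with a real loader
    y = torch.randint(0, 10, (8,))
    for aug, model, opt in zip(augmentations, models, optims):
        loss = criterion(model(aug(x)), y)
        opt.zero_grad()
        loss.backward()
        opt.step()
    if (step + 1) % aggregate_every == 0:
        aggregate_(models)  # keep branches within a shared low-loss basin
```

Averaging weights is only beneficial while the branches remain in a connected low-loss region, which is why the abstract emphasizes repeating the Aggregation step throughout training rather than averaging only once at the end.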
Related papers
- Improved Generalization Bounds for Communication Efficient Federated Learning [4.3707341422218215]
This paper focuses on reducing the communication cost of federated learning by exploring generalization bounds and representation learning.
We design a novel Federated Learning with Adaptive Local Steps (FedALS) algorithm based on our generalization bound and representation learning analysis.
arXiv Detail & Related papers (2024-04-17T21:17:48Z)
- Back to Basics: A Simple Recipe for Improving Out-of-Domain Retrieval in Dense Encoders [63.28408887247742]
We study whether training procedures can be improved to yield better generalization capabilities in the resulting models.
We recommend a simple recipe for training dense encoders: Train on MSMARCO with parameter-efficient methods, such as LoRA, and opt for using in-batch negatives unless given well-constructed hard negatives.
arXiv Detail & Related papers (2023-11-16T10:42:58Z)
- Promoting Generalization for Exact Solvers via Adversarial Instance Augmentation [62.738582127114704]
Adar is a framework for understanding and improving the generalization of both imitation-learning-based (IL-based) and reinforcement-learning-based (RL-based) solvers.
arXiv Detail & Related papers (2023-10-22T03:15:36Z)
- NormAUG: Normalization-guided Augmentation for Domain Generalization [60.159546669021346]
We propose a simple yet effective method called NormAUG (Normalization-guided Augmentation) for deep learning.
Our method introduces diverse information at the feature level and improves the generalization of the main path.
In the test stage, we leverage an ensemble strategy to combine the predictions from the auxiliary path of our model, further boosting performance.
arXiv Detail & Related papers (2023-07-25T13:35:45Z)
- Augmentation-based Domain Generalization for Semantic Segmentation [2.179313476241343]
Unsupervised Domain Adaptation (UDA) and domain generalization (DG) aim to tackle the lack of generalization of Deep Neural Networks (DNNs) towards unseen domains.
We study the in- and out-of-domain generalization capabilities of simple, rule-based image augmentations like blur, noise, color jitter and many more.
Our experiments confirm the common finding that combining multiple different augmentations outperforms single augmentations.
arXiv Detail & Related papers (2023-04-24T14:26:53Z)
- Semi-Supervised Domain Generalization with Stochastic StyleMatch [90.98288822165482]
In real-world applications, we might have only a few labels available from each source domain due to high annotation cost.
In this work, we investigate semi-supervised domain generalization, a more realistic and practical setting.
Our proposed approach, StyleMatch, is inspired by FixMatch, a state-of-the-art semi-supervised learning method based on pseudo-labeling.
arXiv Detail & Related papers (2021-06-01T16:00:08Z)
- Contrastive Syn-to-Real Generalization [125.54991489017854]
We make a key observation that the diversity of the learned feature embeddings plays an important role in the generalization performance.
We propose contrastive synthetic-to-real generalization (CSG), a novel framework that leverages the pre-trained ImageNet knowledge to prevent overfitting to the synthetic domain.
We demonstrate the effectiveness of CSG on various synthetic training tasks, exhibiting state-of-the-art performance on zero-shot domain generalization.
arXiv Detail & Related papers (2021-04-06T05:10:29Z)
- Rethinking Domain Generalization Baselines [21.841393368012977]
Deep learning models can be brittle when deployed in scenarios different from those on which they were trained.
Data augmentation strategies have been shown to be helpful tools for increasing data variability, supporting model robustness across domains.
This issue opens new scenarios for domain generalization research, highlighting the need for novel methods able to properly take advantage of the introduced data variability.
arXiv Detail & Related papers (2021-01-22T11:35:58Z)