Not to Overfit or Underfit? A Study of Domain Generalization in Question
Answering
- URL: http://arxiv.org/abs/2205.07257v1
- Date: Sun, 15 May 2022 10:53:40 GMT
- Title: Not to Overfit or Underfit? A Study of Domain Generalization in Question
Answering
- Authors: Md Arafat Sultan, Avirup Sil and Radu Florian
- Abstract summary: Machine learning models are prone to overfitting their source (training) distributions.
Here we examine the contrasting view that multi-source domain generalization (DG) is in fact a problem of mitigating source domain underfitting.
- Score: 18.22045610080848
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Machine learning models are prone to overfitting their source (training)
distributions, which is commonly believed to be why they falter in novel target
domains. Here we examine the contrasting view that multi-source domain
generalization (DG) is in fact a problem of mitigating source domain
underfitting: models not adequately learning the signal in their multi-domain
training data. Experiments on a reading comprehension DG benchmark show that as
a model gradually learns its source domains better -- using known methods such
as knowledge distillation from a larger model -- its zero-shot out-of-domain
accuracy improves at an even faster rate. Improved source domain learning also
demonstrates superior generalization over three popular domain-invariant
learning methods that aim to counter overfitting.
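The abstract credits improved source-domain learning in part to knowledge distillation from a larger model. For reference, below is a minimal sketch of the standard temperature-scaled distillation objective; the paper's exact recipe may differ, and all names here are illustrative:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Blend hard-label cross-entropy with a soft, temperature-scaled
    KL term that pushes the student toward the teacher's distribution."""
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)  # rescale so gradients match the hard term
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```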
Related papers
- Improving Generalization with Domain Convex Game [32.07275105040802]
Domain generalization (DG) aims to alleviate the poor generalization of deep neural networks by learning a model from multiple source domains.
A classical solution to DG is domain augmentation, the common belief being that diversifying the source domains is conducive to out-of-distribution generalization.
Our explorations reveal that the correlation between model generalization and domain diversity may not be strictly positive, which limits the effectiveness of domain augmentation.
arXiv Detail & Related papers (2023-03-23T14:27:49Z)
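For context on the domain-augmentation baseline this summary refers to, a common instantiation is mixup applied across source domains. The sketch below shows that generic baseline, not the convex-game formulation proposed in the paper; all names are illustrative:

```python
import torch

def interdomain_mixup(x_a, y_a, x_b, y_b, alpha=0.2):
    """Blend examples drawn from two different source domains.
    Returns the mixed inputs, both targets, and the mixing weight."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    x_mix = lam * x_a + (1.0 - lam) * x_b
    # train with: lam * loss(f(x_mix), y_a) + (1 - lam) * loss(f(x_mix), y_b)
    return x_mix, y_a, y_b, lam
```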
- Improving Domain Generalization with Domain Relations [77.63345406973097]
This paper focuses on domain shifts, which occur when the model is applied to new domains that are different from the ones it was trained on.
We propose a new approach called D$^3$G to learn domain-specific models.
Our results show that D$^3$G consistently outperforms state-of-the-art methods.
arXiv Detail & Related papers (2023-02-06T08:11:16Z)
- Normalization Perturbation: A Simple Domain Generalization Method for Real-World Domain Shifts [133.99270341855728]
Real-world domain styles can vary substantially due to environment changes and sensor noises.
Deep models, however, only ever see the style of their training domains.
We propose Normalization Perturbation to overcome this domain style overfitting problem.
arXiv Detail & Related papers (2022-11-08T17:36:49Z)
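The summary above describes perturbing feature styles so the model does not overfit the training domain's style. Here is a minimal sketch in the spirit of that idea, randomly jittering per-channel feature statistics during training; this is a paraphrase, not the paper's exact operator:

```python
import torch

def perturb_channel_stats(feat, noise_std=0.5, training=True):
    """Randomly rescale each channel of an (N, C, H, W) feature map,
    simulating unseen domain styles; identity at evaluation time."""
    if not training:
        return feat
    noise = 1.0 + noise_std * torch.randn(
        feat.size(0), feat.size(1), 1, 1, device=feat.device)
    return feat * noise
```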
- Forget Less, Count Better: A Domain-Incremental Self-Distillation Learning Benchmark for Lifelong Crowd Counting [51.44987756859706]
Off-the-shelf methods have drawbacks when handling multiple domains.
Lifelong Crowd Counting aims to alleviate catastrophic forgetting and improve generalization ability.
arXiv Detail & Related papers (2022-05-06T15:37:56Z)
- Domain Generalization by Mutual-Information Regularization with Pre-trained Models [20.53534134966378]
Domain generalization (DG) aims to learn a model that generalizes to an unseen target domain using only limited source domains.
We re-formulate the DG objective using mutual information with the oracle model, a model generalized to any possible domain.
Our experiments show that Mutual Information Regularization with Oracle (MIRO) significantly improves the out-of-distribution performance.
arXiv Detail & Related papers (2022-03-21T08:07:46Z)
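MIRO's mutual-information objective is approximated in the paper via learned feature statistics; the sketch below reduces it to its simplest recognizable form, a penalty keeping the fine-tuned encoder close to a frozen pretrained copy standing in for the oracle. Treat it as a caricature of the idea, not the published algorithm; all names are illustrative:

```python
import torch
import torch.nn.functional as F

def miro_style_loss(encoder, frozen_encoder, head, x, y, beta=0.1):
    """Task loss plus a regularizer tying current features to the
    features of a frozen pretrained encoder."""
    feats = encoder(x)
    with torch.no_grad():
        ref = frozen_encoder(x)  # pretrained weights, never updated
    task = F.cross_entropy(head(feats), y)
    reg = F.mse_loss(feats, ref)
    return task + beta * reg
```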
- A Novel Mix-normalization Method for Generalizable Multi-source Person Re-identification [49.548815417844786]
Person re-identification (Re-ID) has achieved great success in the supervised scenario.
It is difficult to directly transfer a supervised model to arbitrary unseen domains because the model overfits the seen source domains.
We propose MixNorm, which consists of domain-aware mix-normalization (DMN) and domain-aware center regularization (DCR).
arXiv Detail & Related papers (2022-01-24T18:09:38Z)
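Mix-normalization belongs to the broader family of methods that mix feature statistics across samples or domains (MixStyle is the best-known instance). Below is a minimal sketch of that family with illustrative names; the paper's DMN and DCR components add domain-aware grouping and center regularization on top:

```python
import torch

def mix_feature_stats(feat, lam=0.5, eps=1e-6):
    """Blend each sample's per-channel mean/std with those of a
    randomly paired sample, blurring domain-specific styles."""
    n = feat.size(0)
    mu = feat.mean(dim=(2, 3), keepdim=True)
    sig = feat.std(dim=(2, 3), keepdim=True) + eps
    perm = torch.randperm(n)
    mu_mix = lam * mu + (1 - lam) * mu[perm]
    sig_mix = lam * sig + (1 - lam) * sig[perm]
    return (feat - mu) / sig * sig_mix + mu_mix
```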
- Self-balanced Learning For Domain Generalization [64.99791119112503]
Domain generalization aims to learn a prediction model on multi-domain source data such that the model can generalize to a target domain with unknown statistics.
Most existing approaches have been developed under the assumption that the source data is well-balanced in terms of both domain and class.
We propose a self-balanced domain generalization framework that adaptively learns the weights of losses to alleviate the bias caused by different distributions of the multi-domain source data.
arXiv Detail & Related papers (2021-08-31T03:17:54Z)
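A minimal way to see the re-weighting idea in this entry: scale each example's loss inversely to how frequent its (domain, class) pair is in the training set. The paper learns these weights adaptively; the fixed inverse-frequency sketch below is only the simplest stand-in, with illustrative names:

```python
import torch
import torch.nn.functional as F

def balanced_loss(logits, labels, domains, counts):
    """Per-example cross-entropy weighted by inverse (domain, class)
    frequency; counts[d, c] holds the training count for each pair."""
    per_ex = F.cross_entropy(logits, labels, reduction="none")
    w = 1.0 / counts[domains, labels].float()
    w = w / w.sum() * len(w)  # normalize so weights average to 1
    return (w * per_ex).mean()
```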
- Domain Generalization using Ensemble Learning [0.0]
We tackle the problem of a model's weak generalization when it is trained on a single source domain.
From this perspective, we build an ensemble model on top of base deep learning models trained on a single source to enhance the generalization of their collective prediction.
arXiv Detail & Related papers (2021-03-18T13:50:36Z)
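The ensemble idea in this entry is straightforward to sketch: average the softmax predictions of several base models, each trained on a single source, and take the consensus. Names below are illustrative:

```python
import torch
import torch.nn.functional as F

def ensemble_predict(models, x):
    """Average class probabilities over base models, each trained on a
    single source domain, and return the consensus prediction."""
    with torch.no_grad():
        probs = torch.stack([F.softmax(m(x), dim=-1) for m in models])
    return probs.mean(dim=0).argmax(dim=-1)
```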
- FixBi: Bridging Domain Spaces for Unsupervised Domain Adaptation [26.929772844572213]
We introduce a fixed ratio-based mixup to augment multiple intermediate domains between the source and target domain.
We train a source-dominant model and a target-dominant model that have complementary characteristics.
Through our proposed methods, the models gradually transfer domain knowledge from the source to the target domain.
arXiv Detail & Related papers (2020-11-18T11:58:19Z)
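FixBi's intermediate domains come from mixup with a fixed, asymmetric ratio rather than a sampled one. A minimal sketch follows; the specific ratios are illustrative, and the paper pairs the target-dominant side with pseudo-labels:

```python
def fixed_ratio_mixup(x_src, x_tgt, lam=0.7):
    """Blend source and target inputs at a fixed ratio. lam=0.7 yields
    a source-dominant sample; lam=0.3 a target-dominant one."""
    return lam * x_src + (1.0 - lam) * x_tgt

# Two complementary models, one per dominant side:
# the source-dominant model trains on fixed_ratio_mixup(x_s, x_t, 0.7)
# with source labels; the target-dominant model trains on
# fixed_ratio_mixup(x_s, x_t, 0.3) with target pseudo-labels.
```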
- Learning to Generate Novel Domains for Domain Generalization [115.21519842245752]
This paper focuses on the task of learning from multiple source domains a model that generalizes well to unseen domains.
We employ a data generator to synthesize data from pseudo-novel domains to augment the source domains.
Our method, L2A-OT, outperforms current state-of-the-art DG methods on four benchmark datasets.
arXiv Detail & Related papers (2020-07-07T09:34:17Z)