Learn to Expect the Unexpected: Probably Approximately Correct Domain Generalization
- URL: http://arxiv.org/abs/2002.05660v1
- Date: Thu, 13 Feb 2020 17:37:53 GMT
- Title: Learn to Expect the Unexpected: Probably Approximately Correct Domain Generalization
- Authors: Vikas K. Garg, Adam Kalai, Katrina Ligett, and Zhiwei Steven Wu
- Abstract summary: Domain generalization is the problem of machine learning when the training data and the test data come from different data domains.
We present a simple theoretical model of learning to generalize across domains in which there is a meta-distribution over data distributions.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Domain generalization is the problem of machine learning when the training
data and the test data come from different data domains. We present a simple
theoretical model of learning to generalize across domains in which there is a
meta-distribution over data distributions, and those data distributions may
even have different supports. In our model, the training data given to a
learning algorithm consists of multiple datasets each from a single domain
drawn in turn from the meta-distribution. We study this model in three
different problem settings: a multi-domain Massart noise setting, a decision
tree multi-dataset setting, and a feature selection setting. We find that
computationally efficient, polynomial-sample domain generalization is possible
in each. Experiments demonstrate that our feature selection algorithm indeed
ignores spurious correlations and improves generalization.
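The abstract's setup is easy to simulate: draw several training domains from a meta-distribution and keep only the structure that is stable across all of them. Below is a minimal Python sketch in that spirit (all names are hypothetical, and this is not the feature selection algorithm from the paper): it keeps features whose correlation with the label has a consistent sign in every training domain, so a spurious feature whose sign is redrawn per domain gets filtered out.

```python
# Toy illustration of domain generalization via stable features (not
# the paper's algorithm): keep features whose label correlation has the
# same sign in every training domain drawn from a meta-distribution.
import numpy as np

rng = np.random.default_rng(0)

def sample_domain(n=500):
    """Draw one domain from a toy meta-distribution: feature 0 is truly
    predictive everywhere; feature 1 is spurious, its sign redrawn per
    domain."""
    y = rng.integers(0, 2, size=n)
    spurious_sign = rng.choice([-1.0, 1.0])          # domain-specific nuisance
    x_stable = (2 * y - 1) + rng.normal(0, 1.0, n)   # stable signal
    x_spur = spurious_sign * (2 * y - 1) + rng.normal(0, 0.5, n)
    return np.column_stack([x_stable, x_spur]), y

def stable_features(domains, tol=0.05):
    """Keep features whose label correlation exceeds tol in magnitude
    with a consistent sign across all training domains."""
    corrs = np.array([[np.corrcoef(X[:, j], y)[0, 1]
                       for j in range(X.shape[1])]
                      for X, y in domains])          # (domains, features)
    keep = np.all(corrs > tol, axis=0) | np.all(corrs < -tol, axis=0)
    return np.flatnonzero(keep)

train_domains = [sample_domain() for _ in range(5)]
print("selected features:", stable_features(train_domains))
```

With five training domains this typically prints `selected features: [0]`: the spurious feature's correlation flips sign between domains, so it is dropped even though it is highly predictive within any single domain.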
Related papers
- Domain Adversarial Active Learning for Domain Generalization Classification
Domain generalization models aim to learn cross-domain knowledge from source domain data to improve performance on unknown target domains.
Recent research has demonstrated that diverse and rich source domain samples can enhance domain generalization capability.
We propose a domain-adversarial active learning (DAAL) algorithm for classification tasks in domain generalization.
arXiv Detail & Related papers (2024-03-10T10:59:22Z)
- Multiply Robust Estimation for Local Distribution Shifts with Multiple Domains
We focus on scenarios where data distributions vary across multiple segments of the entire population.
We propose a two-stage multiply robust estimation method to improve model performance on each individual segment.
Our method is designed to be implemented with commonly used off-the-shelf machine learning models.
arXiv Detail & Related papers (2024-02-21T22:01:10Z)
- Multi-Domain Long-Tailed Learning by Augmenting Disentangled Representations
There is an inescapable long-tailed class-imbalance issue in many real-world classification problems.
We study this multi-domain long-tailed learning problem and aim to produce a model that generalizes well across all classes and domains.
Built upon a proposed selective balanced sampling strategy, TALLY achieves this by mixing the semantic representation of one example with the domain-associated nuisances of another (a toy version of this mixing is sketched after this entry).
arXiv Detail & Related papers (2022-10-25T21:54:26Z)
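The mixing idea above can be caricatured in a few lines. In this toy sketch the per-domain feature mean stands in for TALLY's learned domain-associated nuisances, and an example's deviation from its own domain mean stands in for its semantic representation; everything here is a hypothetical simplification, not the method itself.

```python
# Toy stand-in for representation mixing (not TALLY itself): treat the
# per-domain feature mean as the "domain-associated nuisance" and an
# example's deviation from that mean as its "semantic" content.
import numpy as np

rng = np.random.default_rng(1)

# Two synthetic domains with different feature offsets.
domain_a = rng.normal(loc=0.0, scale=1.0, size=(100, 8))
domain_b = rng.normal(loc=3.0, scale=1.0, size=(100, 8))

mean_a = domain_a.mean(axis=0)   # crude "nuisance" of domain A
mean_b = domain_b.mean(axis=0)   # crude "nuisance" of domain B

def mix(example, source_mean, target_mean):
    """Keep the example's semantic part, swap in another domain's nuisance."""
    return (example - source_mean) + target_mean

# Augment a domain-A example so it looks like it came from domain B,
# while preserving its within-domain (semantic) deviation.
augmented = mix(domain_a[0], mean_a, mean_b)
print(np.allclose(augmented - mean_b, domain_a[0] - mean_a))  # True
```

Augmenting rare classes this way multiplies the domain contexts in which their semantic content is seen, which is the intuition behind using such mixing against the long tail.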
- Learning from aggregated data with a maximum entropy model
We show how a new model, similar to a logistic regression, may be learned from aggregated data only by approximating the unobserved feature distribution with a maximum entropy hypothesis.
We present empirical evidence on several public datasets that the model learned this way can achieve performances comparable to those of a logistic model trained with the full unaggregated data.
arXiv Detail & Related papers (2022-10-05T09:17:27Z)
- Domain Adaptation Principal Component Analysis: base linear method for learning with out-of-distribution data
Domain adaptation is a popular paradigm in modern machine learning.
We present a method called Domain Adaptation Principal Component Analysis (DAPCA), which finds a linear reduced data representation useful for solving the domain adaptation task (a simpler linear stand-in is sketched after this entry).
arXiv Detail & Related papers (2022-08-28T21:10:56Z)
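DAPCA's actual objective is not spelled out in the snippet above, so the sketch below falls back to the simplest related linear baseline: plain PCA on pooled source and target samples, giving both domains a shared low-dimensional representation. Treat it as a point of comparison, not as DAPCA.

```python
# Simplest linear stand-in for this entry (plain pooled PCA, NOT the
# DAPCA objective): learn one low-dimensional projection from the
# combined covariance of source and target samples.
import numpy as np

rng = np.random.default_rng(2)
source = rng.normal(0.0, 1.0, size=(200, 10))
target = rng.normal(0.5, 1.2, size=(150, 10))    # shifted target domain

pooled = np.vstack([source, target])
mu = pooled.mean(axis=0)
centered = pooled - mu                           # center before PCA

cov = centered.T @ centered / (len(centered) - 1)
eigvals, eigvecs = np.linalg.eigh(cov)           # eigenvalues ascending
components = eigvecs[:, ::-1][:, :2]             # top-2 principal directions

# Both domains now live in the same 2-D linear representation.
source_2d = (source - mu) @ components
target_2d = (target - mu) @ components
print(source_2d.shape, target_2d.shape)          # (200, 2) (150, 2)
```

A method like DAPCA would additionally bias the projection toward directions that help align the two domains; pooled PCA only finds directions of maximal pooled variance.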
- Improving Multi-Domain Generalization through Domain Re-labeling
We study the important link between pre-specified domain labels and the generalization performance.
We introduce a general approach for multi-domain generalization, MulDEns, which uses an ERM-based deep ensembling backbone.
We show that MulDEns does not require tailoring the augmentation strategy or the training process specific to a dataset.
arXiv Detail & Related papers (2021-12-17T23:21:50Z)
- Towards Data-Free Domain Generalization
How can knowledge contained in models trained on different source data domains be merged into a single model that generalizes well to unseen target domains?
Prior domain generalization methods typically rely on using source domain data, making them unsuitable for private decentralized data.
We propose DEKAN, an approach that extracts and fuses domain-specific knowledge from the available teacher models into a student model robust to domain shift.
arXiv Detail & Related papers (2021-10-09T11:44:05Z)
- Self-balanced Learning For Domain Generalization
Domain generalization aims to learn a prediction model on multi-domain source data such that the model can generalize to a target domain with unknown statistics.
Most existing approaches have been developed under the assumption that the source data is well-balanced in terms of both domain and class.
We propose a self-balanced domain generalization framework that adaptively learns the weights of losses to alleviate the bias caused by the different distributions of the multi-domain source data (a minimal reweighting sketch follows this entry).
arXiv Detail & Related papers (2021-08-31T03:17:54Z)
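One crude way to see what loss reweighting buys: count how often each (domain, class) pair occurs and scale each example's loss inversely. The sketch below uses fixed inverse-frequency weights, a static simplification rather than the adaptively learned weights the paper proposes.

```python
# Minimal reweighting sketch (fixed inverse-frequency heuristic, not
# the paper's adaptively learned weights): up-weight examples from
# (domain, class) pairs that are under-represented in the source data.
import numpy as np

# Hypothetical per-example domain and class labels.
domains = np.array([0, 0, 0, 0, 0, 1, 1, 2])
classes = np.array([0, 0, 0, 1, 1, 0, 1, 1])

num_domains, num_classes = domains.max() + 1, classes.max() + 1
counts = np.zeros((num_domains, num_classes))
np.add.at(counts, (domains, classes), 1)      # joint (domain, class) counts

# Inverse-frequency weight per example, normalized to mean 1.
weights = 1.0 / counts[domains, classes]
weights *= len(weights) / weights.sum()
print(np.round(weights, 2))

# A training loop would then use:
#   loss = (weights * per_example_loss).mean()
```

Examples from singleton (domain, class) cells get triple the weight of those from the most common cell, so the dominant domain no longer drives the gradient alone.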
- Batch Normalization Embeddings for Deep Domain Generalization
Domain generalization aims at training machine learning models to perform robustly across different and unseen domains.
We show a significant increase in classification accuracy over current state-of-the-art techniques on popular domain generalization benchmarks (a speculative sketch of the title's idea follows this entry).
arXiv Detail & Related papers (2020-11-25T12:02:57Z)
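The summary above gives the result but not the mechanism, so the following sketch only guesses at the flavor suggested by the title: treat per-domain normalization statistics (feature means and variances) as a lightweight domain embedding, and relate an unseen batch to the source domains by distance in that space. This is an assumption read off the title, not the paper's architecture.

```python
# Speculative sketch based on the title alone (not the paper's method):
# per-domain feature statistics act as a "domain embedding", and an
# unseen batch is located by its distance to each source domain.
import numpy as np

rng = np.random.default_rng(3)
source_domains = [rng.normal(loc=m, scale=s, size=(256, 16))
                  for m, s in [(0.0, 1.0), (2.0, 1.0), (0.0, 3.0)]]

def domain_embedding(batch):
    """Concatenate per-feature mean and variance into one vector."""
    return np.concatenate([batch.mean(axis=0), batch.var(axis=0)])

source_embs = np.stack([domain_embedding(d) for d in source_domains])

# An unseen target batch whose statistics sit closest to source domain 1.
target_batch = rng.normal(loc=1.8, scale=1.1, size=(256, 16))
dists = np.linalg.norm(source_embs - domain_embedding(target_batch), axis=1)
print("nearest source domain:", int(dists.argmin()))     # prints 1
```

A deep version would read such statistics off batch-normalization layers rather than raw features.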
- In Search of Lost Domain Generalization
We implement DomainBed, a testbed for domain generalization.
We conduct extensive experiments using DomainBed and find that, when carefully implemented, empirical risk minimization shows state-of-the-art performance (a bare-bones pooled-ERM sketch follows this entry).
arXiv Detail & Related papers (2020-07-02T23:08:07Z)
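The DomainBed finding is easy to restate in code: the strong baseline is plain empirical risk minimization over the pooled source domains, with no domain-specific machinery at all. Below is a bare-bones sketch on hypothetical synthetic data (logistic regression by gradient descent; the actual result concerns carefully tuned deep models).

```python
# Bare-bones pooled ERM, the kind of baseline the DomainBed result is
# about: concatenate all source domains and minimize the average loss,
# ignoring domain labels entirely.
import numpy as np

rng = np.random.default_rng(4)

def make_domain(offset, n=300):
    """Synthetic domain: features shift by `offset`, but the labeling
    rule (feature 0 vs feature 1) is the same everywhere."""
    X = rng.normal(offset, 1.0, size=(n, 5))
    y = (X[:, 0] > X[:, 1]).astype(float)
    return X, y

domains = [make_domain(o) for o in (0.0, 1.0, -1.0)]
X = np.vstack([d[0] for d in domains])           # pool all source data
y = np.concatenate([d[1] for d in domains])

# Plain logistic regression trained by gradient descent.
w, b = np.zeros(X.shape[1]), 0.0
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))       # sigmoid predictions
    w -= 0.5 * (X.T @ (p - y)) / len(y)
    b -= 0.5 * (p - y).mean()

print("pooled training accuracy:", ((p > 0.5) == y).mean())
```

Because the labeling rule is shared across domains, pooling suffices here; the DomainBed point is that, given equal tuning effort, this baseline is hard to beat even on real benchmarks.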
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information above and is not responsible for any consequences arising from its use.