Disentanglement and Generalization Under Correlation Shifts
- URL: http://arxiv.org/abs/2112.14754v1
- Date: Wed, 29 Dec 2021 18:55:17 GMT
- Title: Disentanglement and Generalization Under Correlation Shifts
- Authors: Christina M. Funke, Paul Vicol, Kuan-Chieh Wang, Matthias Kümmerer,
Richard Zemel and Matthias Bethge
- Abstract summary: Correlations between factors of variation are prevalent in real-world data.
Machine learning algorithms may benefit from exploiting such correlations, as they can increase predictive performance on noisy data.
We aim to learn representations which capture different factors of variation in latent subspaces.
- Score: 22.499106910581958
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Correlations between factors of variation are prevalent in real-world data.
Machine learning algorithms may benefit from exploiting such correlations, as
they can increase predictive performance on noisy data. However, often such
correlations are not robust (e.g., they may change between domains, datasets,
or applications) and we wish to avoid exploiting them. Disentanglement methods
aim to learn representations which capture different factors of variation in
latent subspaces. A common approach involves minimizing the mutual information
between latent subspaces, such that each encodes a single underlying attribute.
However, this fails when attributes are correlated. We solve this problem by
enforcing independence between subspaces conditioned on the available
attributes, which allows us to remove only dependencies that are not due to the
correlation structure present in the training data. We achieve this via an
adversarial approach to minimize the conditional mutual information (CMI)
between subspaces with respect to categorical variables. We first show
theoretically that CMI minimization is a good objective for robust
disentanglement on linear problems with Gaussian data. We then apply our method
on real-world datasets based on MNIST and CelebA, and show that it yields
models that are disentangled and robust under correlation shift, including in
weakly supervised settings.
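In effect, the standard unconditional constraint I(z_a; z_b) = 0 is replaced by the conditional constraint I(z_a; z_b | y) = 0, so that dependencies between subspaces that are explained by the attribute y are tolerated rather than penalized. Below is a minimal, hypothetical PyTorch sketch of such adversarial CMI minimization; the paper's actual architecture, estimator, and training procedure differ, and all names here (Encoder, Critic, shuffle_within_class) are illustrative assumptions rather than the authors' code.

```python
# Minimal sketch (not the paper's code) of adversarially minimizing the
# conditional mutual information I(z_a; z_b | y) between two latent
# subspaces given a categorical attribute y. A JSD-style discriminator
# stands in for the paper's estimator; names and sizes are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

LATENT, N_CLASSES, IN_DIM = 8, 10, 784

class Encoder(nn.Module):
    """Maps an input x to two latent subspaces (z_a, z_b)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(IN_DIM, 128), nn.ReLU(),
                                 nn.Linear(128, 2 * LATENT))

    def forward(self, x):
        z = self.net(x)
        return z[:, :LATENT], z[:, LATENT:]

class Critic(nn.Module):
    """Discriminates jointly drawn (z_a, z_b) pairs from pairs that are
    independent given y; its logits track conditional dependence."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(2 * LATENT + N_CLASSES, 64),
                                 nn.ReLU(), nn.Linear(64, 1))

    def forward(self, za, zb, y):
        y1h = F.one_hot(y, N_CLASSES).float()
        return self.net(torch.cat([za, zb, y1h], dim=1)).squeeze(1)

def shuffle_within_class(z, y):
    """Permute z only among samples sharing the same y: this breaks the
    (z_a, z_b) pairing while preserving the conditioning on y."""
    idx = torch.arange(len(y))
    for c in y.unique():
        m = (y == c).nonzero(as_tuple=True)[0]
        idx[m] = m[torch.randperm(len(m))]
    return z[idx]

enc, critic = Encoder(), Critic()
opt_enc = torch.optim.Adam(enc.parameters(), lr=1e-4)
opt_cri = torch.optim.Adam(critic.parameters(), lr=1e-4)

x = torch.randn(32, IN_DIM)             # dummy batch
y = torch.randint(0, N_CLASSES, (32,))  # categorical attribute labels

# Critic step: learn to tell joint pairs from conditionally independent
# pairs (latents detached so only the critic is updated).
za, zb = enc(x)
joint = critic(za.detach(), zb.detach(), y)
indep = critic(za.detach(), shuffle_within_class(zb.detach(), y), y)
cri_loss = (F.binary_cross_entropy_with_logits(joint, torch.ones_like(joint))
            + F.binary_cross_entropy_with_logits(indep, torch.zeros_like(indep)))
opt_cri.zero_grad(); cri_loss.backward(); opt_cri.step()

# Encoder step: make joint pairs indistinguishable from conditionally
# independent ones, pushing the estimated CMI toward zero (in practice
# combined with a task or reconstruction loss).
za, zb = enc(x)
logits = critic(za, zb, y)
enc_loss = F.binary_cross_entropy_with_logits(logits, torch.zeros_like(logits))
opt_enc.zero_grad(); enc_loss.backward(); opt_enc.step()
```

The within-class shuffle is what separates this from ordinary adversarial MI minimization: joint pairs are contrasted only against pairs that are independent given y, so dependencies induced by the training-time correlation between attributes are not removed.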
Related papers
- Reducing Spurious Correlation for Federated Domain Generalization [15.864230656989854]
In open-world scenarios, global models may struggle to predict well on entirely new domain data captured by certain media.
Existing methods still rely on strong statistical correlations between samples and labels to address this issue.
We introduce FedCD, an overall optimization framework at both the local and global levels.
arXiv Detail & Related papers (2024-07-27T05:06:31Z) - Generalization error of min-norm interpolators in transfer learning [2.7309692684728617]
Min-norm interpolators emerge naturally as implicitly regularized limits of modern machine learning algorithms.
In many applications, a limited amount of test data may be available during training, yet the properties of min-norm interpolators in this setting are not well understood.
We establish a novel anisotropic local law to achieve these characterizations.
arXiv Detail & Related papers (2024-06-20T02:23:28Z) - Spuriousness-Aware Meta-Learning for Learning Robust Classifiers [26.544938760265136]
Spurious correlations are brittle associations between certain attributes of inputs and target variables.
Deep image classifiers often leverage them for predictions, leading to poor generalization on the data where the correlations do not hold.
Mitigating the impact of spurious correlations is crucial for robust model generalization, but it often requires annotations of the spurious correlations in the data.
arXiv Detail & Related papers (2024-06-15T21:41:25Z) - SALUDA: Surface-based Automotive Lidar Unsupervised Domain Adaptation [62.889835139583965]
We introduce an unsupervised auxiliary task of learning an implicit underlying surface representation simultaneously on source and target data.
As both domains share the same latent representation, the model is forced to accommodate discrepancies between the two sources of data.
Our experiments demonstrate that our method achieves a better performance than the current state of the art, both in real-to-real and synthetic-to-real scenarios.
arXiv Detail & Related papers (2023-04-06T17:36:23Z) - Direct-Effect Risk Minimization for Domain Generalization [11.105832297850188]
We introduce the concepts of direct and indirect effects from causal inference to the domain generalization problem.
We argue that models that learn direct effects minimize the worst-case risk across correlation-shifted domains.
Experiments on 5 correlation-shifted datasets and the DomainBed benchmark verify the effectiveness of our approach.
arXiv Detail & Related papers (2022-11-26T15:35:36Z) - Towards Understanding and Mitigating Dimensional Collapse in Heterogeneous Federated Learning [112.69497636932955]
Federated learning aims to train models across different clients without the sharing of data for privacy considerations.
We study how data heterogeneity affects the representations of the globally aggregated models.
We propose FedDecorr, a novel method that can effectively mitigate dimensional collapse in federated learning.
arXiv Detail & Related papers (2022-10-01T09:04:17Z) - Examining and Combating Spurious Features under Distribution Shift [94.31956965507085]
We define and analyze robust and spurious representations using the information-theoretic concept of minimal sufficient statistics.
We prove that even when there is only a bias in the input distribution, models can still pick up spurious features from their training data.
Inspired by our analysis, we demonstrate that group DRO can fail when groups do not directly account for various spurious correlations.
arXiv Detail & Related papers (2021-06-14T05:39:09Z) - Autoregressive Score Matching [113.4502004812927]
We propose autoregressive conditional score models (AR-CSM) where we parameterize the joint distribution in terms of the derivatives of univariate log-conditionals (scores).
For AR-CSM models, this divergence between data and model distributions can be computed and optimized efficiently, requiring no expensive sampling or adversarial training.
We show with extensive experimental results that it can be applied to density estimation on synthetic data, image generation, image denoising, and training latent variable models with implicit encoders.
arXiv Detail & Related papers (2020-10-24T07:01:24Z) - Accounting for Unobserved Confounding in Domain Generalization [107.0464488046289]
This paper investigates the problem of learning robust, generalizable prediction models from a combination of datasets.
Part of the challenge of learning robust models lies in the influence of unobserved confounders.
We demonstrate the empirical performance of our approach on healthcare data from different modalities.
arXiv Detail & Related papers (2020-07-21T08:18:06Z) - On Disentangled Representations Learned From Correlated Data [59.41587388303554]
We bridge the gap to real-world scenarios by analyzing the behavior of the most prominent disentanglement approaches on correlated data.
We show that systematically induced correlations in the dataset are being learned and reflected in the latent representations.
We also demonstrate how to resolve these latent correlations, either using weak supervision during training or by post-hoc correcting a pre-trained model with a small number of labels.
arXiv Detail & Related papers (2020-06-14T12:47:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.