Invariant Learning via Diffusion Dreamed Distribution Shifts
- URL: http://arxiv.org/abs/2211.10370v1
- Date: Fri, 18 Nov 2022 17:07:43 GMT
- Title: Invariant Learning via Diffusion Dreamed Distribution Shifts
- Authors: Priyatham Kattakinda, Alexander Levine, Soheil Feizi
- Abstract summary: We propose a dataset called Diffusion Dreamed Distribution Shifts (D3S).
D3S consists of synthetic images generated through StableDiffusion using text prompts and image guides obtained by pasting a sample foreground image onto a background template image.
Due to the incredible photorealism of the diffusion model, our images are much closer to natural images than previous synthetic datasets.
- Score: 121.71383835729848
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Though the background is an important signal for image classification, over
reliance on it can lead to incorrect predictions when spurious correlations
between foreground and background are broken at test time. Training on a
dataset where these correlations are unbiased would lead to more robust models.
In this paper, we propose such a dataset called Diffusion Dreamed Distribution
Shifts (D3S). D3S consists of synthetic images generated through
StableDiffusion using text prompts and image guides obtained by pasting a
sample foreground image onto a background template image. Using this scalable
approach we generate 120K images of objects from all 1000 ImageNet classes in
10 diverse backgrounds. Due to the incredible photorealism of the diffusion
model, our images are much closer to natural images than previous synthetic
datasets. D3S contains a validation set of more than 17K images whose labels
are human-verified in an MTurk study. Using the validation set, we evaluate
several popular DNN image classifiers and find that the classification
performance of models generally suffers on our background diverse images. Next,
we leverage the foreground & background labels in D3S to learn a foreground
(background) representation that is invariant to changes in background
(foreground) by penalizing the mutual information between the foreground
(background) features and the background (foreground) labels. Linear
classifiers trained to predict the foreground (background) label from the
foreground (background) features achieve high accuracies of 82.9% (93.8%),
whereas classifiers that predict the foreground label from the background
features, and the background label from the foreground features, reach much
lower accuracies of only 2.4% and 45.6%, respectively. This suggests that our
foreground and background features are well disentangled. We further test the
efficacy of these representations by training classifiers on a task with strong
spurious correlations.
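To make the image-guided generation described above concrete, here is a minimal sketch assuming Hugging Face's diffusers StableDiffusionImg2ImgPipeline as a stand-in for the paper's pipeline; the checkpoint name, file paths, prompt wording, and strength/guidance values are illustrative assumptions rather than the authors' settings.

```python
# Sketch of D3S-style guided generation: paste a foreground crop onto a
# background template, then run Stable Diffusion img2img with a text prompt.
# Paths, prompt, and strength/guidance values are illustrative assumptions.
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

def make_image_guide(foreground_path, background_path, size=512):
    """Paste a sample foreground image onto a background template image."""
    background = Image.open(background_path).convert("RGB").resize((size, size))
    foreground = Image.open(foreground_path).convert("RGBA").resize((size // 2, size // 2))
    guide = background.copy()
    # Center the foreground; its alpha channel acts as the paste mask.
    offset = ((size - foreground.width) // 2, (size - foreground.height) // 2)
    guide.paste(foreground, offset, foreground)
    return guide

# Hypothetical inputs: one ImageNet-class foreground and one background template.
guide = make_image_guide("tench.png", "beach_template.jpg")
prompt = "a photo of a tench on a beach"  # foreground class + background description
image = pipe(prompt=prompt, image=guide, strength=0.7, guidance_scale=7.5).images[0]
image.save("d3s_sample.png")
```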
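The invariance objective can likewise be sketched in code. The paper penalizes the mutual information between the foreground (background) features and the background (foreground) labels; the gradient-reversal adversary below is one common way to implement such a penalty and is an assumption here, as are the head names and loss weighting, not the authors' exact estimator.

```python
# Sketch of the disentanglement objective: each head is trained to predict its
# own label while an adversary with a reversed gradient pushes its features to
# discard information about the other label. Architecture and weights are
# illustrative assumptions, not the paper's exact setup.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.autograd import Function

class GradReverse(Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Flip the gradient flowing back into the features.
        return -ctx.lam * grad_output, None

def grad_reverse(x, lam=1.0):
    return GradReverse.apply(x, lam)

class DisentangledClassifier(nn.Module):
    def __init__(self, backbone, feat_dim, n_fg, n_bg):
        super().__init__()
        self.backbone = backbone                      # shared encoder
        self.fg_head = nn.Linear(feat_dim, feat_dim)  # foreground features
        self.bg_head = nn.Linear(feat_dim, feat_dim)  # background features
        self.fg_cls = nn.Linear(feat_dim, n_fg)       # foreground label from fg features
        self.bg_cls = nn.Linear(feat_dim, n_bg)       # background label from bg features
        self.fg_adv = nn.Linear(feat_dim, n_bg)       # adversary: bg label from fg features
        self.bg_adv = nn.Linear(feat_dim, n_fg)       # adversary: fg label from bg features

    def loss(self, images, fg_labels, bg_labels, lam=1.0):
        h = self.backbone(images)
        z_fg, z_bg = self.fg_head(h), self.bg_head(h)
        task = (F.cross_entropy(self.fg_cls(z_fg), fg_labels)
                + F.cross_entropy(self.bg_cls(z_bg), bg_labels))
        # Adversaries try to recover the "wrong" label; the reversed gradient
        # trains the features to carry as little of that information as possible.
        adv = (F.cross_entropy(self.fg_adv(grad_reverse(z_fg, lam)), bg_labels)
               + F.cross_entropy(self.bg_adv(grad_reverse(z_bg, lam)), fg_labels))
        return task + adv
```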
Related papers
- Semantic-aware Dense Representation Learning for Remote Sensing Image Change Detection [20.761672725633936]
Training a deep learning-based change detection model heavily depends on labeled data.
A recent trend is to use remote sensing (RS) data to obtain in-domain representations via supervised or self-supervised learning (SSL).
We propose dense semantic-aware pre-training for RS image CD via sampling multiple class-balanced points.
arXiv Detail & Related papers (2022-05-27T06:08:33Z)
- A Comprehensive Study of Image Classification Model Sensitivity to Foregrounds, Backgrounds, and Visual Attributes [58.633364000258645]
We call this dataset RIVAL10, consisting of roughly 26k instances over 10 classes.
We evaluate the sensitivity of a broad set of models to noise corruptions in foregrounds, backgrounds and attributes.
In our analysis, we consider diverse state-of-the-art architectures (ResNets, Transformers) and training procedures (CLIP, SimCLR, DeiT, Adversarial Training).
arXiv Detail & Related papers (2022-01-26T06:31:28Z)
- Correlated Input-Dependent Label Noise in Large-Scale Image Classification [4.979361059762468]
We take a principled probabilistic approach to modelling input-dependent, also known as heteroscedastic, label noise in datasets.
We demonstrate that the learned covariance structure captures known sources of label noise between semantically similar and co-occurring classes.
We set a new state-of-the-art result on WebVision 1.0 with 76.6% top-1 accuracy.
arXiv Detail & Related papers (2021-05-19T17:30:59Z)
- An Empirical Study of the Collapsing Problem in Semi-Supervised 2D Human Pose Estimation [80.02124918255059]
Semi-supervised learning aims to boost the accuracy of a model by exploring unlabeled images.
We learn two networks to mutually teach each other.
The more reliable predictions on easy images in each network are used to teach the other network to learn about the corresponding hard images (see the generic sketch after this list).
arXiv Detail & Related papers (2020-11-25T03:29:52Z)
- Background Splitting: Finding Rare Classes in a Sea of Background [55.03789745276442]
We focus on the real-world problem of training accurate deep models for image classification of a small number of rare categories.
In these scenarios, almost all images belong to the background category in the dataset (>95% of the dataset is background).
We demonstrate that both standard fine-tuning approaches and state-of-the-art approaches for training on imbalanced datasets do not produce accurate deep models in the presence of this extreme imbalance.
arXiv Detail & Related papers (2020-08-28T23:05:15Z)
- Noise or Signal: The Role of Image Backgrounds in Object Recognition [93.55720207356603]
We create a toolkit for disentangling foreground and background signal on ImageNet images.
We find that (a) models can achieve non-trivial accuracy by relying on the background alone, (b) models often misclassify images even in the presence of correctly classified foregrounds.
arXiv Detail & Related papers (2020-06-17T16:54:43Z)
- Distilling Localization for Self-Supervised Representation Learning [82.79808902674282]
Contrastive learning has revolutionized unsupervised representation learning.
Current contrastive models are ineffective at localizing the foreground object.
We propose a data-driven approach for learning invariance to backgrounds.
arXiv Detail & Related papers (2020-04-14T16:29:42Z)
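The mutual-teaching scheme summarized in the semi-supervised pose estimation entry above can be illustrated with a generic sketch. The network names, the use of an MSE heatmap loss, and the easy/hard augmentation split are assumptions for illustration, not that paper's exact recipe.

```python
# Generic sketch of mutual teaching: two networks exchange pseudo-labels on
# unlabeled images, each supervising the other on a harder view of the same
# image. Loss choice and augmentations are illustrative assumptions.
import torch
import torch.nn.functional as F

def mutual_teaching_step(net_a, net_b, unlabeled, easy_aug, hard_aug):
    easy, hard = easy_aug(unlabeled), hard_aug(unlabeled)
    with torch.no_grad():
        # Predictions on the easy view serve as pseudo-labels for the other net.
        pseudo_a = net_a(easy)
        pseudo_b = net_b(easy)
    # Each network learns the other's easy-view prediction from the hard view.
    loss_a = F.mse_loss(net_a(hard), pseudo_b)
    loss_b = F.mse_loss(net_b(hard), pseudo_a)
    return loss_a + loss_b
```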