Are Image Distributions Indistinguishable to Humans Indistinguishable to Classifiers?
- URL: http://arxiv.org/abs/2405.18029v1
- Date: Tue, 28 May 2024 10:25:06 GMT
- Title: Are Image Distributions Indistinguishable to Humans Indistinguishable to Classifiers?
- Authors: Zebin You, Xinyu Zhang, Hanzhong Guo, Jingdong Wang, Chongxuan Li
- Abstract summary: We show that, in the eyes of classifiers parameterized by neural networks, the strongest diffusion models are still far from this goal.
Our comprehensive empirical study suggests that, unlike humans, classifiers tend to classify images through edge and high-frequency components.
- Score: 39.31679737754048
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The ultimate goal of generative models is to characterize the data distribution perfectly. For image generation, common metrics of visual quality (e.g., FID), and the truthlikeness of generated images to the human eyes seem to suggest that we are close to achieving it. However, through distribution classification tasks, we find that, in the eyes of classifiers parameterized by neural networks, the strongest diffusion models are still far from this goal. Specifically, classifiers consistently and effortlessly distinguish between real and generated images in various settings. Further, we observe an intriguing discrepancy: classifiers can identify differences between diffusion models with similar performance (e.g., U-ViT-H vs. DiT-XL), but struggle to differentiate between the smallest and largest models in the same family (e.g., EDM2-XS vs. EDM2-XXL), whereas humans exhibit the opposite tendency. As an explanation, our comprehensive empirical study suggests that, unlike humans, classifiers tend to classify images through edge and high-frequency components. We believe that our methodology can serve as a probe to understand how generative models work and inspire further thought on how existing models can be improved and how the abuse of such models can be prevented.
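The abstract attributes the classifiers' behavior to edge and high-frequency components. As a rough illustration of what "high-frequency components" means for an image, the following minimal sketch isolates them with a Fourier-domain high-pass filter and measures how much spectral energy they carry. It is not the authors' procedure: the function names `high_pass` and `high_freq_energy_fraction` and the `cutoff` value of 0.25 cycles per pixel are assumptions chosen for illustration, and the sketch handles only single-channel (grayscale) arrays.

```python
import numpy as np


def _radial_frequency(shape):
    """Radial spatial frequency (cycles per pixel) for each FFT bin of a 2-D array."""
    h, w = shape
    fy = np.fft.fftfreq(h)[:, None]  # vertical frequencies, in [-0.5, 0.5)
    fx = np.fft.fftfreq(w)[None, :]  # horizontal frequencies, in [-0.5, 0.5)
    return np.sqrt(fx ** 2 + fy ** 2)


def high_pass(image, cutoff=0.25):
    """Zero out every frequency with radius <= cutoff (hypothetical threshold)."""
    image = np.asarray(image, dtype=float)
    spectrum = np.fft.fft2(image)
    spectrum[_radial_frequency(image.shape) <= cutoff] = 0
    # The mask is symmetric, so the inverse transform is real up to rounding.
    return np.real(np.fft.ifft2(spectrum))


def high_freq_energy_fraction(image, cutoff=0.25):
    """Fraction of total spectral power located above the cutoff radius."""
    image = np.asarray(image, dtype=float)
    power = np.abs(np.fft.fft2(image)) ** 2
    total = power.sum()
    if total == 0:
        return 0.0
    return float(power[_radial_frequency(image.shape) > cutoff].sum() / total)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    noise = rng.normal(size=(64, 64))  # stand-in for a grayscale image
    print(high_freq_energy_fraction(noise))
    print(high_pass(noise).shape)
```

One way to use such a filter as a probe, under the same assumptions, would be to compare a real-versus-generated classifier's accuracy on the original images against its accuracy on the `high_pass` outputs; the listing above does not say whether the paper runs this exact comparison.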
Related papers
- Training Class-Imbalanced Diffusion Model Via Overlap Optimization [55.96820607533968]
Diffusion models trained on real-world datasets often yield inferior fidelity for tail classes.
Deep generative models, including diffusion models, are biased towards classes with abundant training images.
We propose a method based on contrastive learning to minimize the overlap between distributions of synthetic images for different classes.
arXiv Detail & Related papers (2024-02-16T16:47:21Z)
- Do text-free diffusion models learn discriminative visual representations? [43.05419164830729]
We explore the possibility of a unified representation learner: a model which addresses both families of tasks simultaneously.
We identify diffusion models, a state-of-the-art method for generative tasks, as a prime candidate.
We find that diffusion models are better than GANs, and, with our fusion and feedback mechanisms, can compete with state-of-the-art unsupervised image representation learning methods for discriminative tasks.
arXiv Detail & Related papers (2023-11-29T18:59:59Z)
- Intriguing properties of generative classifiers [14.57861413242093]
We build on advances in generative modeling that turn text-to-image models into classifiers.
They show a record-breaking human-like shape bias (99% for Imagen), near human-level out-of-distribution accuracy, and state-of-the-art alignment with human classification errors.
Our results indicate that while the current dominant paradigm for modeling human object recognition is discriminative inference, zero-shot generative models approximate human object recognition data surprisingly well.
arXiv Detail & Related papers (2023-09-28T18:19:40Z)
- Your Diffusion Model is Secretly a Zero-Shot Classifier [90.40799216880342]
We show that density estimates from large-scale text-to-image diffusion models can be leveraged to perform zero-shot classification.
Our generative approach to classification attains strong results on a variety of benchmarks.
Our results are a step toward using generative over discriminative models for downstream tasks.
arXiv Detail & Related papers (2023-03-28T17:59:56Z)
- Diffusion Models as Artists: Are we Closing the Gap between Humans and Machines? [4.802758600019422]
We adapt the 'diversity vs. recognizability' scoring framework from Boutin et al., 2022.
We find that one-shot diffusion models have indeed started to close the gap between humans and machines.
arXiv Detail & Related papers (2023-01-27T14:08:15Z)
- Traditional Classification Neural Networks are Good Generators: They are Competitive with DDPMs and GANs [104.72108627191041]
We show that conventional neural network classifiers can generate high-quality images comparable to state-of-the-art generative models.
We propose a mask-based reconstruction module that makes the gradients semantic-aware so that they synthesize plausible images.
We show that our method is also applicable to text-to-image generation by regarding image-text foundation models.
arXiv Detail & Related papers (2022-11-27T11:25:35Z)
- Diffusion Visual Counterfactual Explanations [51.077318228247925]
Visual Counterfactual Explanations (VCEs) are an important tool to understand the decisions of an image classifier.
Current approaches for the generation of VCEs are restricted to adversarially robust models and often contain non-realistic artefacts.
In this paper, we overcome this by generating Diffusion Visual Counterfactual Explanations (DVCEs) for arbitrary ImageNet classifiers.
arXiv Detail & Related papers (2022-10-21T09:35:47Z)
- Counterfactual Generative Networks [59.080843365828756]
We propose to decompose the image generation process into independent causal mechanisms that we train without direct supervision.
By exploiting appropriate inductive biases, these mechanisms disentangle object shape, object texture, and background.
We show that the counterfactual images can improve out-of-distribution robustness with a marginal drop in performance on the original classification task.
arXiv Detail & Related papers (2021-01-15T10:23:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.