Related papers: Are Image Distributions Indistinguishable to Humans Indistinguishable to Classifiers?

Are Image Distributions Indistinguishable to Humans Indistinguishable to Classifiers?

URL: http://arxiv.org/abs/2405.18029v1
Date: Tue, 28 May 2024 10:25:06 GMT
Title: Are Image Distributions Indistinguishable to Humans Indistinguishable to Classifiers?
Authors: Zebin You, Xinyu Zhang, Hanzhong Guo, Jingdong Wang, Chongxuan Li,
Abstract summary: We show that, in the eyes of classifiers parameterized by neural networks, the strongest diffusion models are still far from this goal. Our comprehensive empirical study suggests that, unlike humans, classifiers tend to classify images through edge and high-frequency components.
Score: 39.31679737754048
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The ultimate goal of generative models is to characterize the data distribution perfectly. For image generation, common metrics of visual quality (e.g., FID), and the truthlikeness of generated images to the human eyes seem to suggest that we are close to achieving it. However, through distribution classification tasks, we find that, in the eyes of classifiers parameterized by neural networks, the strongest diffusion models are still far from this goal. Specifically, classifiers consistently and effortlessly distinguish between real and generated images in various settings. Further, we observe an intriguing discrepancy: classifiers can identify differences between diffusion models with similar performance (e.g., U-ViT-H vs. DiT-XL), but struggle to differentiate between the smallest and largest models in the same family (e.g., EDM2-XS vs. EDM2-XXL), whereas humans exhibit the opposite tendency. As an explanation, our comprehensive empirical study suggests that, unlike humans, classifiers tend to classify images through edge and high-frequency components. We believe that our methodology can serve as a probe to understand how generative models work and inspire further thought on how existing models can be improved and how the abuse of such models can be prevented.

Related papers

DreamDA: Generative Data Augmentation with Diffusion Models [68.22440150419003]
This paper proposes a new classification-oriented framework DreamDA. DreamDA generates diverse samples that adhere to the original data distribution by considering training images in the original data as seeds. In addition, since the labels of the generated data may not align with the labels of their corresponding seed images, we introduce a self-training paradigm for generating pseudo labels.
arXiv Detail & Related papers (2024-03-19T15:04:35Z)
Training Class-Imbalanced Diffusion Model Via Overlap Optimization [55.96820607533968]
Diffusion models trained on real-world datasets often yield inferior fidelity for tail classes. Deep generative models, including diffusion models, are biased towards classes with abundant training images. We propose a method based on contrastive learning to minimize the overlap between distributions of synthetic images for different classes.
arXiv Detail & Related papers (2024-02-16T16:47:21Z)
Learned representation-guided diffusion models for large-image generation [58.192263311786824]
We introduce a novel approach that trains diffusion models conditioned on embeddings from self-supervised learning (SSL) Our diffusion models successfully project these features back to high-quality histopathology and remote sensing images. Augmenting real data by generating variations of real images improves downstream accuracy for patch-level and larger, image-scale classification tasks.
arXiv Detail & Related papers (2023-12-12T14:45:45Z)
DiffDis: Empowering Generative Diffusion Model with Cross-Modal Discrimination Capability [75.9781362556431]
We propose DiffDis to unify the cross-modal generative and discriminative pretraining into one single framework under the diffusion process. We show that DiffDis outperforms single-task models on both the image generation and the image-text discriminative tasks.
arXiv Detail & Related papers (2023-08-18T05:03:48Z)
InfoDiffusion: Representation Learning Using Information Maximizing Diffusion Models [35.566528358691336]
InfoDiffusion is an algorithm that augments diffusion models with low-dimensional latent variables. InfoDiffusion relies on a learning objective regularized with the mutual information between observed and hidden variables. We find that InfoDiffusion learns disentangled and human-interpretable latent representations that are competitive with state-of-the-art generative and contrastive methods.
arXiv Detail & Related papers (2023-06-14T21:48:38Z)
Intriguing properties of synthetic images: from generative adversarial networks to diffusion models [19.448196464632]
It is important to gain insight into which image features better discriminate fake images from real ones. In this paper we report on our systematic study of a large number of image generators of different families, aimed at discovering the most forensically relevant characteristics of real and generated images.
arXiv Detail & Related papers (2023-04-13T11:13:19Z)
Your Diffusion Model is Secretly a Zero-Shot Classifier [90.40799216880342]
We show that density estimates from large-scale text-to-image diffusion models can be leveraged to perform zero-shot classification. Our generative approach to classification attains strong results on a variety of benchmarks. Our results are a step toward using generative over discriminative models for downstream tasks.
arXiv Detail & Related papers (2023-03-28T17:59:56Z)
Traditional Classification Neural Networks are Good Generators: They are Competitive with DDPMs and GANs [104.72108627191041]
We show that conventional neural network classifiers can generate high-quality images comparable to state-of-the-art generative models. We propose a mask-based reconstruction module to make semantic gradients-aware to synthesize plausible images. We show that our method is also applicable to text-to-image generation by regarding image-text foundation models.
arXiv Detail & Related papers (2022-11-27T11:25:35Z)
Diffusion Visual Counterfactual Explanations [51.077318228247925]
Visual Counterfactual Explanations (VCEs) are an important tool to understand the decisions of an image. Current approaches for the generation of VCEs are restricted to adversarially robust models and often contain non-realistic artefacts. In this paper, we overcome this by generating Visual Diffusion Counterfactual Explanations (DVCEs) for arbitrary ImageNet classifiers.
arXiv Detail & Related papers (2022-10-21T09:35:47Z)
Fast Unsupervised Brain Anomaly Detection and Segmentation with Diffusion Models [1.6352599467675781]
We propose a method based on diffusion models to detect and segment anomalies in brain imaging. Our diffusion models achieve competitive performance compared with autoregressive approaches across a series of experiments with 2D CT and MRI data.
arXiv Detail & Related papers (2022-06-07T17:30:43Z)
Characterizing and Avoiding Problematic Global Optima of Variational Autoencoders [28.36260646471421]
Variational Auto-encoders (VAEs) are deep generative latent variable models. Recent work shows that traditional training methods tend to yield solutions that violate desiderata. We show that both issues stem from the fact that the global optima of the VAE training objective often correspond to undesirable solutions.
arXiv Detail & Related papers (2020-03-17T15:14:25Z)

This list is automatically generated from the titles and abstracts of the papers in this site.