On quantifying and improving realism of images generated with diffusion
- URL: http://arxiv.org/abs/2309.14756v1
- Date: Tue, 26 Sep 2023 08:32:55 GMT
- Title: On quantifying and improving realism of images generated with diffusion
- Authors: Yunzhuo Chen, Naveed Akhtar, Nur Al Hasan Haldar, Ajmal Mian
- Abstract summary: We propose a metric, called Image Realism Score (IRS), computed from five statistical measures of a given image.
IRS is easily usable as a measure to classify a given image as real or fake.
We experimentally establish the model- and data-agnostic nature of the proposed IRS by successfully detecting fake images generated by Stable Diffusion Model (SDM), Dalle2, Midjourney and BigGAN.
Our efforts have also led to the Gen-100 dataset, which provides 1,000 samples for 100 classes generated by four high-quality models.
- Score: 50.37578424163951
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advances in diffusion models have led to a quantum leap in the quality
of generative visual content. However, quantification of realism of the content
is still challenging. Existing evaluation metrics, such as Inception Score and
Fréchet inception distance, fall short on benchmarking diffusion models due
to the versatility of the generated images. Moreover, they are not designed to
quantify realism of an individual image. This restricts their application in
forensic image analysis, which is becoming increasingly important in the
emerging era of generative models. To address that, we first propose a metric,
called Image Realism Score (IRS), computed from five statistical measures of a
given image. This non-learning based metric not only efficiently quantifies
realism of the generated images, but is also readily usable as a measure to classify
a given image as real or fake. We experimentally establish the model- and
data-agnostic nature of the proposed IRS by successfully detecting fake images
generated by Stable Diffusion Model (SDM), Dalle2, Midjourney and BigGAN.
We further leverage this attribute of our metric to minimize an IRS-augmented
generative loss of SDM, and demonstrate a convenient yet considerable quality
improvement of the SDM-generated content with our modification. Our efforts
have also led to the Gen-100 dataset, which provides 1,000 samples for 100 classes
generated by four high-quality models. We will release the dataset and code.
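
The abstract describes IRS as a non-learning score built from five statistical measures of a single image and then used to label the image as real or fake. The abstract does not name those five measures, so the following is only a minimal sketch of that general pattern. The five statistics, the weights, the threshold, and every function name here (`image_statistics`, `combined_score`, `classify`) are hypothetical placeholders and are not the authors' IRS.

```python
import math
from collections import Counter


def image_statistics(pixels):
    """Return five illustrative statistics for a grayscale image.

    `pixels` is a list of equal-length rows of ints in 0..255.
    These are placeholder measures, NOT the five measures used by IRS.
    """
    flat = [v for row in pixels for v in row]
    n = len(flat)
    mean = sum(flat) / n
    variance = sum((v - mean) ** 2 for v in flat) / n
    # Shannon entropy of the intensity histogram, in bits.
    counts = Counter(flat)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    # Mean absolute difference between horizontally adjacent pixels.
    h_diffs = [abs(row[i + 1] - row[i])
               for row in pixels for i in range(len(row) - 1)]
    # Mean absolute difference between vertically adjacent pixels.
    v_diffs = [abs(pixels[r + 1][c] - pixels[r][c])
               for r in range(len(pixels) - 1)
               for c in range(len(pixels[0]))]
    h_grad = sum(h_diffs) / len(h_diffs) if h_diffs else 0.0
    v_grad = sum(v_diffs) / len(v_diffs) if v_diffs else 0.0
    return [mean, variance, entropy, h_grad, v_grad]


def combined_score(stats, weights):
    """Combine the statistics into one number (hypothetical weighted sum)."""
    return sum(w * s for w, s in zip(weights, stats))


def classify(score, threshold):
    """Label an image by thresholding its score (threshold is an assumption)."""
    return "real" if score >= threshold else "fake"


if __name__ == "__main__":
    checkerboard = [[0, 255], [255, 0]]
    stats = image_statistics(checkerboard)
    # Hypothetical weights that use only the entropy term.
    score = combined_score(stats, [0, 0, 1, 0, 0])
    print(stats, score, classify(score, threshold=0.5))
```

Because nothing is trained, a score of this kind can be computed for one image at a time, which is the property the abstract highlights for forensic use. In practice the weights and threshold would have to be chosen from real and generated samples.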
Related papers
- MMAR: Towards Lossless Multi-Modal Auto-Regressive Probabilistic Modeling [64.09238330331195]
We propose a novel Multi-Modal Auto-Regressive (MMAR) probabilistic modeling framework.
Unlike discretization-based methods, MMAR takes in continuous-valued image tokens to avoid information loss.
We show that MMAR demonstrates superior performance compared with other joint multi-modal models.
arXiv Detail & Related papers (2024-10-14T17:57:18Z)
- ImagiNet: A Multi-Content Dataset for Generalizable Synthetic Image Detection via Contrastive Learning [0.0]
Generative models produce images with a level of authenticity nearly indistinguishable from real photos and artwork.
The difficulty of identifying synthetic images leaves online media platforms vulnerable to impersonation and misinformation attempts.
We introduce ImagiNet, a high-resolution and balanced dataset for synthetic image detection.
arXiv Detail & Related papers (2024-07-29T13:57:24Z)
- TC-DiffRecon: Texture coordination MRI reconstruction method based on diffusion model and modified MF-UNet method [2.626378252978696]
We propose a novel diffusion model-based MRI reconstruction method, named TC-DiffRecon, which does not rely on a specific acceleration factor for training.
We also suggest the incorporation of the MF-UNet module, designed to enhance the quality of MRI images generated by the model.
arXiv Detail & Related papers (2024-02-17T13:09:00Z)
- The Journey, Not the Destination: How Data Guides Diffusion Models [75.19694584942623]
Diffusion models trained on large datasets can synthesize photo-realistic images of remarkable quality and diversity.
We propose a framework that: (i) provides a formal notion of data attribution in the context of diffusion models, and (ii) allows us to counterfactually validate such attributions.
arXiv Detail & Related papers (2023-12-11T08:39:43Z)
- ExposureDiffusion: Learning to Expose for Low-light Image Enhancement [87.08496758469835]
This work addresses the issue by seamlessly integrating a diffusion model with a physics-based exposure model.
Our method obtains significantly improved performance and reduced inference time compared with vanilla diffusion models.
The proposed framework can work with real-paired datasets, SOTA noise models, and different backbone networks.
arXiv Detail & Related papers (2023-07-15T04:48:35Z)
- Hierarchical Integration Diffusion Model for Realistic Image Deblurring [71.76410266003917]
Diffusion models (DMs) have been introduced in image deblurring and exhibited promising performance.
We propose the Hierarchical Integration Diffusion Model (HI-Diff), for realistic image deblurring.
Experiments on synthetic and real-world blur datasets demonstrate that our HI-Diff outperforms state-of-the-art methods.
arXiv Detail & Related papers (2023-05-22T12:18:20Z)
- Parents and Children: Distinguishing Multimodal DeepFakes from Natural Images [60.34381768479834]
Recent advancements in diffusion models have enabled the generation of realistic deepfakes from textual prompts in natural language.
We pioneer a systematic study on detecting deepfakes generated by state-of-the-art diffusion models.
arXiv Detail & Related papers (2023-04-02T10:25:09Z)
- Intriguing Property and Counterfactual Explanation of GAN for Remote Sensing Image Generation [25.96740500337747]
Generative adversarial networks (GANs) have achieved remarkable progress in the natural image field.
GAN models are more sensitive to the size of training data for RS image generation than for natural image generation.
We propose two innovative adjustment schemes, namely Uniformity Regularization (UR) and Entropy Regularization (ER), to increase the information learned by the GAN model.
arXiv Detail & Related papers (2023-03-09T13:22:50Z)
- DOLCE: A Model-Based Probabilistic Diffusion Framework for Limited-Angle CT Reconstruction [42.028139152832466]
Limited-Angle Computed Tomography (LACT) is a non-destructive evaluation technique used in a variety of applications ranging from security to medicine.
We present DOLCE, a new deep model-based framework for LACT that uses a conditional diffusion model as an image prior.
arXiv Detail & Related papers (2022-11-22T15:30:38Z)
- Generative Zero-shot Network Quantization [41.75769117366117]
Convolutional neural networks are able to learn realistic image priors from numerous training samples in low-level image generation and restoration.
We show that, for high-level image recognition tasks, we can further reconstruct "realistic" images of each category by leveraging intrinsic Batch Normalization (BN) statistics without any training data.
arXiv Detail & Related papers (2021-01-21T04:10:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.