Self-Adaptive Reality-Guided Diffusion for Artifact-Free Super-Resolution
- URL: http://arxiv.org/abs/2403.16643v1
- Date: Mon, 25 Mar 2024 11:29:19 GMT
- Title: Self-Adaptive Reality-Guided Diffusion for Artifact-Free Super-Resolution
- Authors: Qingping Zheng, Ling Zheng, Yuanfan Guo, Ying Li, Songcen Xu, Jiankang Deng, Hang Xu,
- Abstract summary: Artifact-free super-resolution (SR) aims to translate low-resolution images into their high-resolution counterparts with a strict integrity of the original content.
Traditional diffusion-based SR techniques are prone to artifact introduction during iterative procedures.
We propose Self-Adaptive Reality-Guided Diffusion to identify and mitigate the propagation of artifacts.
- Score: 47.29558685384506
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Artifact-free super-resolution (SR) aims to translate low-resolution images into their high-resolution counterparts with a strict integrity of the original content, eliminating any distortions or synthetic details. While traditional diffusion-based SR techniques have demonstrated remarkable abilities to enhance image detail, they are prone to artifact introduction during iterative procedures. Such artifacts, ranging from trivial noise to unauthentic textures, deviate from the true structure of the source image, thus challenging the integrity of the super-resolution process. In this work, we propose Self-Adaptive Reality-Guided Diffusion (SARGD), a training-free method that delves into the latent space to effectively identify and mitigate the propagation of artifacts. Our SARGD begins by using an artifact detector to identify implausible pixels, creating a binary mask that highlights artifacts. Following this, the Reality Guidance Refinement (RGR) process refines artifacts by integrating this mask with realistic latent representations, improving alignment with the original image. Nonetheless, initial realistic-latent representations from lower-quality images result in over-smoothing in the final output. To address this, we introduce a Self-Adaptive Guidance (SAG) mechanism. It dynamically computes a reality score, enhancing the sharpness of the realistic latent. These alternating mechanisms collectively achieve artifact-free super-resolution. Extensive experiments demonstrate the superiority of our method, delivering detailed artifact-free high-resolution images while reducing sampling steps by 2X. We release our code at https://github.com/ProAirVerse/Self-Adaptive-Guidance-Diffusion.git.
Related papers
- Realistic Extreme Image Rescaling via Generative Latent Space Learning [51.85790402171696]
We propose a novel framework called Latent Space Based Image Rescaling (LSBIR) for extreme image rescaling tasks.
LSBIR effectively leverages powerful natural image priors learned by a pre-trained text-to-image diffusion model to generate realistic HR images.
In the first stage, a pseudo-invertible encoder-decoder models the bidirectional mapping between the latent features of the HR image and the target-sized LR image.
In the second stage, the reconstructed features from the first stage are refined by a pre-trained diffusion model to generate more faithful and visually pleasing details.
arXiv Detail & Related papers (2024-08-17T09:51:42Z) - SSL: A Self-similarity Loss for Improving Generative Image Super-resolution [11.94842557256442]
Generative adversarial networks (GAN) and generative diffusion models (DM) have been widely used in real-world image super-resolution (Real-ISR)
These generative models are prone to generating visual artifacts and false image structures, resulting in unnatural Real-ISR results.
We propose a simple yet effective self-similarity loss (SSL) to improve the performance of generative Real-ISR models.
arXiv Detail & Related papers (2024-08-11T07:46:06Z) - One-Step Effective Diffusion Network for Real-World Image Super-Resolution [11.326598938246558]
We propose a one-step effective diffusion network, namely OSEDiff, for the Real-ISR problem.
We finetune the pre-trained diffusion network with trainable layers to adapt it to complex image degradations.
Our OSEDiff model can efficiently and effectively generate HQ images in just one diffusion step.
arXiv Detail & Related papers (2024-06-12T13:10:31Z) - Pixel-Inconsistency Modeling for Image Manipulation Localization [63.54342601757723]
Digital image forensics plays a crucial role in image authentication and manipulation localization.
This paper presents a generalized and robust manipulation localization model through the analysis of pixel inconsistency artifacts.
Experiments show that our method successfully extracts inherent pixel-inconsistency forgery fingerprints.
arXiv Detail & Related papers (2023-09-30T02:54:51Z) - Realistic Restorer: artifact-free flow restorer(AF2R) for MRI motion
artifact removal [3.8103327351507255]
Motion artifact severely degrades image quality, reduces examination efficiency, and makes accurate diagnosis difficult.
Previous methods often relied on implicit models for artifact correction, resulting in biases in modeling the artifact formation mechanism.
We incorporate the artifact generation mechanism to reestablish the relationship between artifacts and anatomical content in the image domain.
arXiv Detail & Related papers (2023-06-19T04:02:01Z) - NeRFInvertor: High Fidelity NeRF-GAN Inversion for Single-shot Real
Image Animation [66.0838349951456]
Nerf-based Generative models have shown impressive capacity in generating high-quality images with consistent 3D geometry.
We propose a universal method to surgically fine-tune these NeRF-GAN models in order to achieve high-fidelity animation of real subjects only by a single image.
arXiv Detail & Related papers (2022-11-30T18:36:45Z) - Identifying Invariant Texture Violation for Robust Deepfake Detection [17.306386179823576]
We propose the Invariant Texture Learning framework, which only accesses the published dataset with low visual quality.
Our method is based on the prior that the microscopic facial texture of the source face is inevitably violated by the texture transferred from the target person.
arXiv Detail & Related papers (2020-12-19T03:02:15Z) - Deep CG2Real: Synthetic-to-Real Translation via Image Disentanglement [78.58603635621591]
Training an unpaired synthetic-to-real translation network in image space is severely under-constrained.
We propose a semi-supervised approach that operates on the disentangled shading and albedo layers of the image.
Our two-stage pipeline first learns to predict accurate shading in a supervised fashion using physically-based renderings as targets.
arXiv Detail & Related papers (2020-03-27T21:45:41Z) - PULSE: Self-Supervised Photo Upsampling via Latent Space Exploration of
Generative Models [77.32079593577821]
PULSE (Photo Upsampling via Latent Space Exploration) generates high-resolution, realistic images at resolutions previously unseen in the literature.
Our method outperforms state-of-the-art methods in perceptual quality at higher resolutions and scale factors than previously possible.
arXiv Detail & Related papers (2020-03-08T16:44:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.