FreeEnhance: Tuning-Free Image Enhancement via Content-Consistent Noising-and-Denoising Process
- URL: http://arxiv.org/abs/2409.07451v1
- Date: Wed, 11 Sep 2024 17:58:50 GMT
- Title: FreeEnhance: Tuning-Free Image Enhancement via Content-Consistent Noising-and-Denoising Process
- Authors: Yang Luo, Yiheng Zhang, Zhaofan Qiu, Ting Yao, Zhineng Chen, Yu-Gang Jiang, Tao Mei,
- Abstract summary: FreeEnhance is a framework for content-consistent image enhancement using off-the-shelf image diffusion models.
In the noising stage, FreeEnhance adds lighter noise to regions with higher frequency to preserve the high-frequency patterns in the original image.
In the denoising stage, we present three target properties as constraints to regularize the predicted noise, enhancing images with high acutance and high visual quality.
- Score: 120.91393949012014
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The emergence of text-to-image generation models has led to the recognition that image enhancement, performed as post-processing, can significantly improve the visual quality of generated images. Exploiting diffusion models to enhance generated images is nevertheless not trivial: it requires delicately enriching plentiful details while preserving the visual appearance of key content in the original image. In this paper, we propose a novel framework, namely FreeEnhance, for content-consistent image enhancement using off-the-shelf image diffusion models. Technically, FreeEnhance is a two-stage process that first adds random noise to the input image and then capitalizes on a pre-trained image diffusion model (i.e., Latent Diffusion Models) to denoise and enhance the image details. In the noising stage, FreeEnhance adds lighter noise to regions with higher frequency to preserve the high-frequency patterns (e.g., edges, corners) in the original image. In the denoising stage, we present three target properties as constraints to regularize the predicted noise, enhancing images with high acutance and high visual quality. Extensive experiments conducted on the HPDv2 dataset demonstrate that FreeEnhance outperforms state-of-the-art image enhancement models in both quantitative metrics and human preference. More remarkably, FreeEnhance also achieves higher human preference than the commercial image enhancement solution Magnific AI.
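The noising stage can be pictured with a short sketch. The code below is a minimal, illustrative reconstruction of the frequency-adaptive noising idea only, written in PyTorch and operating in pixel space for simplicity; the Laplacian filter, the high_frequency_map / content_adaptive_noising names, and the base_sigma / min_scale weighting are assumptions for illustration, not the authors' actual formulation, which couples this step to a pre-trained latent diffusion model and constrained denoising.
```python
# Minimal sketch of frequency-adaptive noising (assumed formulation), in PyTorch.
import torch
import torch.nn.functional as F

def high_frequency_map(img: torch.Tensor) -> torch.Tensor:
    """Estimate per-pixel high-frequency content with a Laplacian filter.

    img: (B, C, H, W) tensor in [0, 1]; returns a (B, 1, H, W) map in [0, 1].
    """
    lap = torch.tensor([[0., 1., 0.],
                        [1., -4., 1.],
                        [0., 1., 0.]]).view(1, 1, 3, 3)
    gray = img.mean(dim=1, keepdim=True)               # collapse channels
    resp = F.conv2d(gray, lap, padding=1).abs()
    resp = F.avg_pool2d(resp, 7, stride=1, padding=3)  # smooth the response
    return resp / (resp.amax(dim=(2, 3), keepdim=True) + 1e-8)

def content_adaptive_noising(img: torch.Tensor, base_sigma: float = 0.6,
                             min_scale: float = 0.3) -> torch.Tensor:
    """Add Gaussian noise whose strength shrinks in high-frequency regions,
    so edges and corners survive the subsequent diffusion denoising."""
    freq = high_frequency_map(img)
    # Lighter noise where the frequency map is high, full noise in flat regions.
    sigma = base_sigma * (1.0 - (1.0 - min_scale) * freq)
    return (img + sigma * torch.randn_like(img)).clamp(0.0, 1.0)

if __name__ == "__main__":
    x = torch.rand(1, 3, 256, 256)        # stand-in for an input image
    print(content_adaptive_noising(x).shape)
```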
Related papers
- DiffHarmony: Latent Diffusion Model Meets Image Harmonization [11.500358677234939]
Diffusion models have promoted the rapid development of image-to-image translation tasks.
Fine-tuning pre-trained latent diffusion models is computationally intensive.
In this paper, we adapt a pre-trained latent diffusion model to the image harmonization task to generate harmonious but potentially blurry initial images.
arXiv Detail & Related papers (2024-04-09T09:05:23Z)
- CasSR: Activating Image Power for Real-World Image Super-Resolution [24.152495730507823]
Cascaded diffusion for Super-Resolution, CasSR, is a novel method designed to produce highly detailed and realistic images.
We develop a cascaded controllable diffusion model that aims to optimize the extraction of information from low-resolution images.
arXiv Detail & Related papers (2024-03-18T03:59:43Z)
- DreamDrone: Text-to-Image Diffusion Models are Zero-shot Perpetual View Generators [56.994967294931286]
We introduce DreamDrone, a novel zero-shot and training-free pipeline for generating flythrough scenes from textual prompts.
We advocate explicitly warping the intermediate latent code of the pre-trained text-to-image diffusion model for high-quality image generation and unbounded generalization ability.
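As a rough illustration of what warping an intermediate latent looks like in code, the hedged PyTorch sketch below resamples a latent tensor along a dense flow field via grid_sample; the warp_latent name and the flow-field interface are assumptions for illustration, not DreamDrone's actual camera-pose-driven warping.
```python
# Illustrative latent-warping sketch (assumed interface), in PyTorch.
import torch
import torch.nn.functional as F

def warp_latent(latent: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
    """Resample an intermediate diffusion latent along a dense flow field.

    latent: (B, C, h, w) latent code; flow: (B, 2, h, w) offsets in latent cells.
    """
    b, _, h, w = latent.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    base = torch.stack((xs, ys), dim=0).float().unsqueeze(0)   # (1, 2, h, w)
    coords = base + flow
    # Normalize sampling coordinates to [-1, 1] as grid_sample expects.
    coords[:, 0] = coords[:, 0] / (w - 1) * 2 - 1
    coords[:, 1] = coords[:, 1] / (h - 1) * 2 - 1
    grid = coords.permute(0, 2, 3, 1)                          # (B, h, w, 2), (x, y) order
    return F.grid_sample(latent, grid, align_corners=True, padding_mode="border")

if __name__ == "__main__":
    z = torch.randn(1, 4, 64, 64)                # stand-in for a latent code
    flow = torch.zeros(1, 2, 64, 64)
    flow[:, 0] += 2.0                            # sample two cells to the right
    print(warp_latent(z, flow).shape)
```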
arXiv Detail & Related papers (2023-12-14T08:42:26Z)
- GeNIe: Generative Hard Negative Images Through Diffusion [16.619150568764262]
Recent advances in generative AI have enabled more sophisticated augmentation techniques that produce data resembling natural images.
We introduce GeNIe, a novel augmentation method which leverages a latent diffusion model conditioned on a text prompt to generate challenging augmentations.
Our experiments demonstrate the effectiveness of our novel augmentation method and its superior performance over the prior art.
arXiv Detail & Related papers (2023-12-05T07:34:30Z)
- LDM-ISP: Enhancing Neural ISP for Low Light with Latent Diffusion Models [54.93010869546011]
We propose to leverage the pre-trained latent diffusion model to perform the neural ISP for enhancing extremely low-light images.
Specifically, to tailor the pre-trained latent diffusion model to operate on the RAW domain, we train a set of lightweight taming modules.
We observe different roles of UNet denoising and decoder reconstruction in the latent diffusion model, which inspires us to decompose the low-light image enhancement task into latent-space low-frequency content generation and decoding-phase high-frequency detail maintenance.
arXiv Detail & Related papers (2023-12-02T04:31:51Z)
- FreePIH: Training-Free Painterly Image Harmonization with Diffusion Model [19.170302996189335]
Our FreePIH method tames the denoising process as a plug-in module for foreground image style transfer.
We make use of multi-scale features to enforce the consistency of the content and stability of the foreground objects in the latent space.
Our method can surpass representative baselines by large margins.
arXiv Detail & Related papers (2023-11-25T04:23:49Z)
- AdaDiff: Adaptive Step Selection for Fast Diffusion [88.8198344514677]
We introduce AdaDiff, a framework designed to learn instance-specific step usage policies.
AdaDiff is optimized using a policy gradient method to maximize a carefully designed reward function.
Our approach achieves visual quality similar to the baseline that uses a fixed 50 denoising steps.
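A minimal sketch of the policy-gradient idea, under assumed interfaces: a tiny policy head chooses a step count per instance and is updated with REINFORCE against a placeholder reward. The candidate step set, the StepPolicy module, and the reward below are illustrative assumptions, not AdaDiff's actual design.
```python
# Hedged REINFORCE sketch for instance-specific step selection (assumed interfaces).
import torch
import torch.nn as nn

STEP_CHOICES = torch.tensor([10, 20, 30, 40, 50])    # candidate step counts (assumed)

class StepPolicy(nn.Module):
    """Tiny policy head mapping an instance embedding to a step-count choice."""
    def __init__(self, embed_dim: int = 128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(embed_dim, 64), nn.ReLU(),
                                 nn.Linear(64, len(STEP_CHOICES)))

    def forward(self, feat: torch.Tensor) -> torch.distributions.Categorical:
        return torch.distributions.Categorical(logits=self.net(feat))

def reinforce_step(policy, optimizer, feat, reward_fn):
    """One REINFORCE update: sample a step count, score it, raise its log-probability."""
    dist = policy(feat)
    idx = dist.sample()
    steps = STEP_CHOICES[idx]
    reward = reward_fn(steps)                         # e.g. image quality minus step cost
    loss = -(dist.log_prob(idx) * reward).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return steps

if __name__ == "__main__":
    policy = StepPolicy()
    opt = torch.optim.Adam(policy.parameters(), lr=1e-3)
    feat = torch.randn(4, 128)                        # stand-in per-image features
    # Placeholder reward that only penalizes step count; a real reward would
    # also measure the visual quality of the resulting image.
    print(reinforce_step(policy, opt, feat, lambda s: 1.0 - s.float() / 50.0))
```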
arXiv Detail & Related papers (2023-11-24T11:20:38Z)
- CoDi: Conditional Diffusion Distillation for Higher-Fidelity and Faster Image Generation [49.3016007471979]
Large generative diffusion models have revolutionized text-to-image generation and offer immense potential for conditional generation tasks.
However, their widespread adoption is hindered by the high computational cost, which limits their real-time application.
We introduce a novel method, dubbed CoDi, that adapts a pre-trained latent diffusion model to accept additional image conditioning inputs.
arXiv Detail & Related papers (2023-10-02T17:59:18Z)
- Steered Diffusion: A Generalized Framework for Plug-and-Play Conditional Image Synthesis [62.07413805483241]
Steered Diffusion is a framework for zero-shot conditional image generation using a diffusion model trained for unconditional generation.
We present experiments using steered diffusion on several tasks including inpainting, colorization, text-guided semantic editing, and image super-resolution.
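A hedged sketch of the general loss-guided steering recipe such a framework builds on: the predicted clean image is scored by a task-specific guidance loss, and the gradient of that loss nudges the unconditional noise prediction. The steered_eps function and its interface are assumptions for illustration, not the paper's exact steering operators.
```python
# Generic loss-guided sampling step (assumed interface), in PyTorch.
import torch

def steered_eps(x_t, t, eps_model, alpha_bar_t, guidance_loss, scale=1.0):
    """Steer an unconditionally trained diffusion model with a guidance loss.

    guidance_loss maps the predicted clean image x0_hat to a scalar that is
    low when the condition is met (e.g. agreement with known pixels for
    inpainting, or a color / caption similarity term).
    """
    alpha_bar_t = torch.as_tensor(alpha_bar_t, dtype=x_t.dtype)
    x_t = x_t.detach().requires_grad_(True)
    eps = eps_model(x_t, t)
    # Clean-image estimate implied by the current noisy sample.
    x0_hat = (x_t - torch.sqrt(1.0 - alpha_bar_t) * eps) / torch.sqrt(alpha_bar_t)
    grad = torch.autograd.grad(guidance_loss(x0_hat), x_t)[0]
    # Move the noise prediction along the loss gradient so the sampler is
    # pushed toward samples with lower guidance loss.
    return (eps + scale * torch.sqrt(1.0 - alpha_bar_t) * grad).detach()

if __name__ == "__main__":
    dummy_eps = lambda x, t: 0.1 * x                  # stand-in noise predictor
    x_t = torch.randn(1, 3, 32, 32)
    target = torch.zeros(1, 3, 32, 32)
    loss = lambda x0: ((x0 - target) ** 2).mean()     # toy "match this image" condition
    print(steered_eps(x_t, 0, dummy_eps, 0.5, loss).shape)
```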
arXiv Detail & Related papers (2023-09-30T02:03:22Z)
- Reconstruct-and-Generate Diffusion Model for Detail-Preserving Image Denoising [16.43285056788183]
We propose a novel approach called the Reconstruct-and-Generate Diffusion Model (RnG).
Our method leverages a reconstructive denoising network to recover the majority of the underlying clean signal.
It employs a diffusion algorithm to generate residual high-frequency details, thereby enhancing visual quality.
arXiv Detail & Related papers (2023-09-19T16:01:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.