Related papers: Improving Diffusion-based Image Translation using Asymmetric Gradient Guidance

Improving Diffusion-based Image Translation using Asymmetric Gradient Guidance

URL: http://arxiv.org/abs/2306.04396v1
Date: Wed, 7 Jun 2023 12:56:56 GMT
Title: Improving Diffusion-based Image Translation using Asymmetric Gradient Guidance
Authors: Gihyun Kwon, Jong Chul Ye
Abstract summary: We present an approach that guides the reverse process of diffusion sampling by applying asymmetric gradient guidance. Our model's adaptability allows it to be implemented with both image-fusion and latent-dif models. Experiments show that our method outperforms various state-of-the-art models in image translation tasks.
Score: 51.188396199083336
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Diffusion models have shown significant progress in image translation tasks recently. However, due to their stochastic nature, there's often a trade-off between style transformation and content preservation. Current strategies aim to disentangle style and content, preserving the source image's structure while successfully transitioning from a source to a target domain under text or one-shot image conditions. Yet, these methods often require computationally intense fine-tuning of diffusion models or additional neural networks. To address these challenges, here we present an approach that guides the reverse process of diffusion sampling by applying asymmetric gradient guidance. This results in quicker and more stable image manipulation for both text-guided and image-guided image translation. Our model's adaptability allows it to be implemented with both image- and latent-diffusion models. Experiments show that our method outperforms various state-of-the-art models in image translation tasks.

Related papers

Oscillation Inversion: Understand the structure of Large Flow Model through the Lens of Inversion Method [60.88467353578118]
We show that a fixed-point-inspired iterative approach to invert real-world images does not achieve convergence, instead oscillating between distinct clusters. We introduce a simple and fast distribution transfer technique that facilitates image enhancement, stroke-based recoloring, as well as visual prompt-guided image editing.
arXiv Detail & Related papers (2024-11-17T17:45:37Z)
ReNoise: Real Image Inversion Through Iterative Noising [62.96073631599749]
We introduce an inversion method with a high quality-to-operation ratio, enhancing reconstruction accuracy without increasing the number of operations. We evaluate the performance of our ReNoise technique using various sampling algorithms and models, including recent accelerated diffusion models.
arXiv Detail & Related papers (2024-03-21T17:52:08Z)
Real-World Image Variation by Aligning Diffusion Inversion Chain [53.772004619296794]
A domain gap exists between generated images and real-world images, which poses a challenge in generating high-quality variations of real-world images. We propose a novel inference pipeline called Real-world Image Variation by ALignment (RIVAL) Our pipeline enhances the generation quality of image variations by aligning the image generation process to the source image's inversion chain.
arXiv Detail & Related papers (2023-05-30T04:09:47Z)
Zero-Shot Contrastive Loss for Text-Guided Diffusion Image Style Transfer [38.957512116073616]
We propose a zero-shot contrastive loss for diffusion models that doesn't require additional fine-tuning or auxiliary networks. Our method can generate images with the same semantic content as the source image in a zero-shot manner.
arXiv Detail & Related papers (2023-03-15T13:47:02Z)
Cap2Aug: Caption guided Image to Image data Augmentation [41.53127698828463]
Cap2Aug is an image-to-image diffusion model-based data augmentation strategy using image captions as text prompts. We generate captions from the limited training images and using these captions edit the training images using an image-to-image stable diffusion model. This strategy generates augmented versions of images similar to the training images yet provides semantic diversity across the samples.
arXiv Detail & Related papers (2022-12-11T04:37:43Z)
Semantic-Conditional Diffusion Networks for Image Captioning [116.86677915812508]
We propose a new diffusion model based paradigm tailored for image captioning, namely Semantic-Conditional Diffusion Networks (SCD-Net) In SCD-Net, multiple Diffusion Transformer structures are stacked to progressively strengthen the output sentence with better visional-language alignment and linguistical coherence. Experiments on COCO dataset demonstrate the promising potential of using diffusion models in the challenging image captioning task.
arXiv Detail & Related papers (2022-12-06T16:08:16Z)
Diffusion-based Image Translation using Disentangled Style and Content Representation [51.188396199083336]
Diffusion-based image translation guided by semantic texts or a single target image has enabled flexible style transfer. It is often difficult to maintain the original content of the image during the reverse diffusion. We present a novel diffusion-based unsupervised image translation method using disentangled style and content representation. Our experimental results show that the proposed method outperforms state-of-the-art baseline models in both text-guided and image-guided translation tasks.
arXiv Detail & Related papers (2022-09-30T06:44:37Z)
MIDMs: Matching Interleaved Diffusion Models for Exemplar-based Image Translation [29.03892463588357]
We present a novel method for exemplar-based image translation, called matching interleaved diffusion models (MIDMs) We formulate a diffusion-based matching-and-generation framework that interleaves cross-domain matching and diffusion steps in the latent space. To improve the reliability of the diffusion process, we design a confidence-aware process using cycle-consistency to consider only confident regions.
arXiv Detail & Related papers (2022-09-22T14:43:52Z)

This list is automatically generated from the titles and abstracts of the papers in this site.