Dual Diffusion Implicit Bridges for Image-to-Image Translation
- URL: http://arxiv.org/abs/2203.08382v1
- Date: Wed, 16 Mar 2022 04:10:45 GMT
- Title: Dual Diffusion Implicit Bridges for Image-to-Image Translation
- Authors: Xuan Su, Jiaming Song, Chenlin Meng, Stefano Ermon
- Abstract summary: Common image-to-image translation methods rely on joint training over data from both source and target domains.
We present Dual Diffusion Implicit Bridges (DDIBs), an image translation method based on diffusion models.
DDIBs allow translations between arbitrary pairs of source-target domains, given independently trained diffusion models on respective domains.
- Score: 104.59371476415566
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Common image-to-image translation methods rely on joint training over data
from both source and target domains. This excludes cases where domain data is
private (e.g., in a federated setting), and often means that a new model has to
be trained for a new pair of domains. We present Dual Diffusion Implicit
Bridges (DDIBs), an image translation method based on diffusion models that
circumvents training on domain pairs. DDIBs allow translations between
arbitrary pairs of source-target domains, given independently trained diffusion
models on the respective domains. Image translation with DDIBs is a two-step
process: DDIBs first obtain latent encodings for source images with the source
diffusion model, and next decode such encodings using the target model to
construct target images. Moreover, DDIBs enable cycle-consistency by default
and are theoretically connected to optimal transport. Experimentally, we apply
DDIBs to a variety of synthetic and high-resolution image datasets,
demonstrating their utility in example-guided color transfer and
image-to-image translation, as well as their connections to optimal transport
methods.
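The two-step process described in the abstract maps directly onto deterministic (DDIM-style) sampling. Below is a minimal sketch of that idea, assuming two independently trained epsilon-prediction networks (`source_model`, `target_model`) that share a noise schedule `alpha_bar` of cumulative alpha-bar values; the function names, signatures, and the coarse first-order inversion are illustrative assumptions, not the authors' released implementation.

```python
import torch

@torch.no_grad()
def ddim_step(model, x, t_cur, t_next, alpha_bar):
    # One deterministic DDIM update from t_cur to t_next. The same rule
    # runs "forward" (t increasing: image -> latent) and "backward"
    # (t decreasing: latent -> image), since it discretizes an invertible ODE.
    a_cur, a_next = alpha_bar[t_cur], alpha_bar[t_next]
    eps = model(x, t_cur)                                 # predicted noise
    x0 = (x - (1 - a_cur).sqrt() * eps) / a_cur.sqrt()    # predicted clean image
    return a_next.sqrt() * x0 + (1 - a_next).sqrt() * eps

@torch.no_grad()
def ddib_translate(x_src, source_model, target_model, alpha_bar, steps=50):
    ts = torch.linspace(0, len(alpha_bar) - 1, steps).long()
    x = x_src
    # Step 1: encode with the source model (source image -> Gaussian latent).
    for t_cur, t_next in zip(ts[:-1], ts[1:]):
        x = ddim_step(source_model, x, t_cur, t_next, alpha_bar)
    # Step 2: decode the latent with the target model (latent -> target image).
    rts = ts.flip(0)
    for t_cur, t_next in zip(rts[:-1], rts[1:]):
        x = ddim_step(target_model, x, t_cur, t_next, alpha_bar)
    return x
```

Because each update discretizes an invertible probability-flow ODE, running the two steps in reverse (encode with the target model, decode with the source model) recovers the original image up to discretization error, which is the cycle-consistency property the abstract notes.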
Related papers
- I2I-Galip: Unsupervised Medical Image Translation Using Generative Adversarial CLIP [30.506544165999564]
Unpaired image-to-image translation is a challenging task due to the absence of paired examples.
We propose a new image-to-image translation framework named Image-to-Image-Generative-Adversarial-CLIP (I2I-Galip).
arXiv Detail & Related papers (2024-09-19T01:44:50Z)
- Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model [101.65105730838346]
We introduce Transfusion, a recipe for training a multi-modal model over discrete and continuous data.
We pretrain multiple Transfusion models up to 7B parameters from scratch on a mixture of text and image data.
Our experiments show that Transfusion scales significantly better than quantizing images and training a language model over discrete image tokens.
arXiv Detail & Related papers (2024-08-20T17:48:20Z)
- Frequency-Controlled Diffusion Model for Versatile Text-Guided Image-to-Image Translation [17.30877810859863]
Large-scale text-to-image (T2I) diffusion models have emerged as a powerful tool for image-to-image translation (I2I).
This paper proposes frequency-controlled diffusion model (FCDiffusion), an end-to-end diffusion-based framework.
arXiv Detail & Related papers (2024-07-03T11:05:19Z)
- Rethinking Score Distillation as a Bridge Between Image Distributions [97.27476302077545]
We show that our method seeks to transport corrupted images (source) to the natural image distribution (target).
Our method can be easily applied across many domains, matching or beating the performance of specialized methods.
We demonstrate its utility in text-to-2D, text-based NeRF optimization, translating paintings to real images, optical illusion generation, and 3D sketch-to-real.
arXiv Detail & Related papers (2024-06-13T17:59:58Z)
- Source-Free Domain Adaptation with Diffusion-Guided Source Data Generation [6.087274577167399]
This paper introduces a novel approach that leverages the generalizability of diffusion models for Source-Free Domain Adaptation (DM-SFDA).
Our proposed DM-SFDA method involves fine-tuning a pre-trained text-to-image diffusion model to generate source domain images.
We validate our approach through comprehensive experiments across a range of datasets, including Office-31, Office-Home, and VisDA.
arXiv Detail & Related papers (2024-02-07T14:56:13Z)
- BBDM: Image-to-image Translation with Brownian Bridge Diffusion Models [50.39417112077254]
A novel image-to-image translation method based on the Brownian Bridge Diffusion Model (BBDM) is proposed.
To the best of our knowledge, it is the first work to propose a Brownian Bridge diffusion process for image-to-image translation.
arXiv Detail & Related papers (2022-05-16T13:47:02Z)
- Unpaired Image-to-Image Translation via Latent Energy Transport [61.62293304236371]
Image-to-image translation aims to preserve source content while translating images to the discriminative style of the target domain.
In this paper, we propose to deploy an energy-based model (EBM) in the latent space of a pretrained autoencoder for this task.
Our model is the first to be applicable to 1024×1024-resolution unpaired image translation.
arXiv Detail & Related papers (2020-12-01T17:18:58Z)
- Unsupervised Image-to-Image Translation via Pre-trained StyleGAN2 Network [73.5062435623908]
We propose a new I2I translation method that generates a new model in the target domain via a series of model transformations.
By feeding the latent vector into the generated model, we can perform I2I translation between the source domain and target domain.
arXiv Detail & Related papers (2020-10-12T13:51:40Z)