Dual Diffusion Implicit Bridges for Image-to-Image Translation
- URL: http://arxiv.org/abs/2203.08382v1
- Date: Wed, 16 Mar 2022 04:10:45 GMT
- Title: Dual Diffusion Implicit Bridges for Image-to-Image Translation
- Authors: Xuan Su, Jiaming Song, Chenlin Meng, Stefano Ermon
- Abstract summary: Common image-to-image translation methods rely on joint training over data from both source and target domains.
We present Dual Diffusion Implicit Bridges (DDIBs), an image translation method based on diffusion models.
DDIBs allow translations between arbitrary pairs of source-target domains, given independently trained diffusion models on respective domains.
- Score: 104.59371476415566
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Common image-to-image translation methods rely on joint training over data
from both source and target domains. This excludes cases where domain data is
private (e.g., in a federated setting), and often means that a new model has to
be trained for a new pair of domains. We present Dual Diffusion Implicit
Bridges (DDIBs), an image translation method based on diffusion models that
circumvents training on domain pairs. DDIBs allow translations between
arbitrary pairs of source-target domains, given independently trained diffusion
models on the respective domains. Image translation with DDIBs is a two-step
process: DDIBs first obtain latent encodings for source images with the source
diffusion model, and next decode such encodings using the target model to
construct target images. Moreover, DDIBs enable cycle-consistency by default
and are theoretically connected to optimal transport. Experimentally, we apply
DDIBs to a variety of synthetic and high-resolution image datasets,
demonstrating their utility in example-guided color transfer and
image-to-image translation, as well as their connections to optimal transport
methods.
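The two-step process described in the abstract maps directly onto deterministic (DDIM-style) sampling. Below is a minimal sketch of that idea, assuming two independently trained epsilon-prediction networks (`source_model`, `target_model`) that share a noise schedule `alpha_bar` of cumulative alpha-bar values; the function names, signatures, and the coarse first-order inversion are illustrative assumptions, not the authors' released implementation.

```python
import torch

@torch.no_grad()
def ddim_step(model, x, t_cur, t_next, alpha_bar):
    # One deterministic DDIM update from t_cur to t_next. The same rule
    # runs "forward" (t increasing: image -> latent) and "backward"
    # (t decreasing: latent -> image), since it discretizes an invertible ODE.
    a_cur, a_next = alpha_bar[t_cur], alpha_bar[t_next]
    eps = model(x, t_cur)                                 # predicted noise
    x0 = (x - (1 - a_cur).sqrt() * eps) / a_cur.sqrt()    # predicted clean image
    return a_next.sqrt() * x0 + (1 - a_next).sqrt() * eps

@torch.no_grad()
def ddib_translate(x_src, source_model, target_model, alpha_bar, steps=50):
    ts = torch.linspace(0, len(alpha_bar) - 1, steps).long()
    x = x_src
    # Step 1: encode with the source model (source image -> Gaussian latent).
    for t_cur, t_next in zip(ts[:-1], ts[1:]):
        x = ddim_step(source_model, x, t_cur, t_next, alpha_bar)
    # Step 2: decode the latent with the target model (latent -> target image).
    rts = ts.flip(0)
    for t_cur, t_next in zip(rts[:-1], rts[1:]):
        x = ddim_step(target_model, x, t_cur, t_next, alpha_bar)
    return x
```

Because each update discretizes an invertible probability-flow ODE, running the two steps in reverse (encode with the target model, decode with the source model) recovers the original image up to discretization error, which is the cycle-consistency property the abstract notes.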
Related papers
- I2I-Galip: Unsupervised Medical Image Translation Using Generative Adversarial CLIP [30.506544165999564]
Unpaired image-to-image translation is a challenging task due to the absence of paired examples.
We propose a new image-to-image translation framework named Image-to-Image-Generative-Adversarial-CLIP (I2I-Galip).
arXiv Detail & Related papers (2024-09-19T01:44:50Z)
- Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model [101.65105730838346]
We introduce Transfusion, a recipe for training a multi-modal model over discrete and continuous data.
We pretrain multiple Transfusion models up to 7B parameters from scratch on a mixture of text and image data.
Our experiments show that Transfusion scales significantly better than quantizing images and training a language model over discrete image tokens.
arXiv Detail & Related papers (2024-08-20T17:48:20Z)
- Frequency-Controlled Diffusion Model for Versatile Text-Guided Image-to-Image Translation [17.30877810859863]
Large-scale text-to-image (T2I) diffusion models have emerged as a powerful tool for image-to-image translation (I2I).
This paper proposes frequency-controlled diffusion model (FCDiffusion), an end-to-end diffusion-based framework.
arXiv Detail & Related papers (2024-07-03T11:05:19Z)
- Rethinking Score Distillation as a Bridge Between Image Distributions [97.27476302077545]
We show that our method seeks to transport corrupted images (source) to the natural image distribution (target).
Our method can be easily applied across many domains, matching or beating the performance of specialized methods.
We demonstrate its utility in text-to-2D, text-based NeRF optimization, translating paintings to real images, optical illusion generation, and 3D sketch-to-real.
arXiv Detail & Related papers (2024-06-13T17:59:58Z)
- Source-Free Domain Adaptation with Diffusion-Guided Source Data Generation [6.087274577167399]
This paper introduces a novel approach that leverages the generalizability of diffusion models for Source-Free Domain Adaptation (DM-SFDA).
Our proposed DM-SFDA method involves fine-tuning a pre-trained text-to-image diffusion model to generate source domain images.
We validate our approach through comprehensive experiments across a range of datasets, including Office-31, Office-Home, and VisDA.
arXiv Detail & Related papers (2024-02-07T14:56:13Z)
- BBDM: Image-to-image Translation with Brownian Bridge Diffusion Models [50.39417112077254]
A novel image-to-image translation method based on the Brownian Bridge Diffusion Model (BBDM) is proposed.
To the best of our knowledge, it is the first work to propose a Brownian Bridge diffusion process for image-to-image translation.
arXiv Detail & Related papers (2022-05-16T13:47:02Z)
- Unpaired Image-to-Image Translation via Latent Energy Transport [61.62293304236371]
Image-to-image translation aims to preserve source content while translating images to the discriminative style of the target domain.
In this paper, we propose to deploy an energy-based model (EBM) in the latent space of a pretrained autoencoder for this task.
Our model is the first to be applicable to 1024×1024-resolution unpaired image translation.
arXiv Detail & Related papers (2020-12-01T17:18:58Z)
- Unsupervised Image-to-Image Translation via Pre-trained StyleGAN2 Network [73.5062435623908]
We propose a new I2I translation method that generates a new model in the target domain via a series of model transformations.
By feeding the latent vector into the generated model, we can perform I2I translation between the source domain and target domain.
arXiv Detail & Related papers (2020-10-12T13:51:40Z)