Seed-to-Seed: Image Translation in Diffusion Seed Space
- URL: http://arxiv.org/abs/2409.00654v1
- Date: Sun, 1 Sep 2024 08:07:59 GMT
- Title: Seed-to-Seed: Image Translation in Diffusion Seed Space
- Authors: Or Greenberg, Eran Kishon, Dani Lischinski
- Abstract summary: We introduce Seed-to-Seed Translation (StS), a novel approach for Image-to-Image Translation using diffusion models (DMs).
We leverage the semantic information encoded within the space of inverted seeds of a pretrained DM, dubbed as the seed-space.
Our approach offers a fresh perspective on leveraging the semantic information encoded within the seed-space of pretrained DMs for effective image editing and manipulation.
- Score: 20.590890565046074
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We introduce Seed-to-Seed Translation (StS), a novel approach for Image-to-Image Translation using diffusion models (DMs), aimed at translations that require close adherence to the structure of the source image. In contrast to existing methods that modify images during the diffusion sampling process, we leverage the semantic information encoded within the space of inverted seeds of a pretrained DM, dubbed as the seed-space. We demonstrate that inverted seeds can be used for discriminative tasks, and can also be manipulated to achieve desired transformations in an unpaired image-to-image translation setting. Our method involves training an sts-GAN, an unpaired translation model between source and target seeds, based on CycleGAN. The final translated images are obtained by initiating the DM's sampling process from the translated seeds. A ControlNet is used to ensure the structural preservation of the input image. We demonstrate the effectiveness of our approach for the task of translating automotive scenes, showcasing superior performance compared to existing GAN-based and diffusion-based methods, as well as for several other unpaired image translation tasks. Our approach offers a fresh perspective on leveraging the semantic information encoded within the seed-space of pretrained DMs for effective image editing and manipulation.
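The abstract states only that the sts-GAN is an unpaired seed-to-seed translation model based on CycleGAN. As a rough illustration of what that implies, the following is a minimal sketch of a CycleGAN-style cycle-consistency loss applied to seed tensors. The seed shape, the toy translators `G`, `F`, and `F_bad`, and the function name are all hypothetical stand-ins; they are not taken from the paper, which may weight or combine its losses differently.

```python
import numpy as np

def cycle_consistency_loss(G, F, z_src, z_tgt):
    """L1 cycle loss in the CycleGAN style, computed on seeds.

    G maps source-domain seeds to target-domain seeds, F maps back.
    The loss is small when F(G(z_src)) ~ z_src and G(F(z_tgt)) ~ z_tgt.
    """
    forward = np.abs(F(G(z_src)) - z_src).mean()
    backward = np.abs(G(F(z_tgt)) - z_tgt).mean()
    return forward + backward

rng = np.random.default_rng(0)

# Assumption: seeds are latent noise tensors; the shape here is illustrative.
z_src = rng.standard_normal((2, 4, 8, 8))
z_tgt = rng.standard_normal((2, 4, 8, 8))

# Toy translators standing in for trained networks (hypothetical).
G = lambda z: 2.0 * z + 1.0        # source seed -> target seed
F = lambda z: (z - 1.0) / 2.0      # exact inverse of G
F_bad = lambda z: z                # identity map, not an inverse of G

loss_good = cycle_consistency_loss(G, F, z_src, z_tgt)
loss_bad = cycle_consistency_loss(G, F_bad, z_src, z_tgt)
print(f"cycle loss with inverse pair: {loss_good:.2e}")
print(f"cycle loss with mismatched pair: {loss_bad:.3f}")
```

In the pipeline the abstract describes, a translated seed such as `G(z_src)` would then initialize the DM's sampling process, with a ControlNet conditioning on the source image's structure; those stages are omitted from this sketch.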
Related papers
- StegoGAN: Leveraging Steganography for Non-Bijective Image-to-Image Translation [18.213286385769525]
CycleGAN-based methods are known to hide the mismatched information in the generated images to bypass cycle consistency objectives.
We introduce StegoGAN, a novel model that leverages steganography to prevent spurious features in generated images.
Our approach enhances the semantic consistency of the translated images without requiring additional postprocessing or supervision.
arXiv Detail & Related papers (2024-03-29T12:23:58Z) - Contrastive Denoising Score for Text-guided Latent Diffusion Image Editing [58.48890547818074]
We present a powerful modification of Contrastive Denoising Score (CUT) for latent diffusion models (LDM).
Our approach enables zero-shot image-to-image translation and neural field (NeRF) editing, achieving structural correspondence between the input and output.
arXiv Detail & Related papers (2023-11-30T15:06:10Z) - Masked Discriminators for Content-Consistent Unpaired Image-to-Image Translation [1.3654846342364308]
A common goal of unpaired image-to-image translation is to preserve content consistency between source images and translated images.
We show that masking the inputs of a global discriminator for both domains with a content-based mask is sufficient to reduce content inconsistencies significantly.
In our experiments, we show that our method achieves state-of-the-art performance in photorealistic sim-to-real translation and weather translation.
arXiv Detail & Related papers (2023-09-22T21:32:07Z) - Diffusion-based Image Translation with Label Guidance for Domain Adaptive Semantic Segmentation [35.44771460784343]
Translating images from a source domain to a target domain for learning target models is one of the most common strategies in domain adaptive semantic segmentation (DASS).
Existing methods still struggle to preserve semantically-consistent local details between the original and translated images.
We present an innovative approach that addresses this challenge by using source-domain labels as explicit guidance during image translation.
arXiv Detail & Related papers (2023-08-23T18:01:01Z) - Improving Diffusion-based Image Translation using Asymmetric Gradient Guidance [51.188396199083336]
We present an approach that guides the reverse process of diffusion sampling by applying asymmetric gradient guidance.
Our model's adaptability allows it to be implemented with both image diffusion and latent diffusion models.
Experiments show that our method outperforms various state-of-the-art models in image translation tasks.
arXiv Detail & Related papers (2023-06-07T12:56:56Z) - Conditional Score Guidance for Text-Driven Image-to-Image Translation [52.73564644268749]
We present a novel algorithm for text-driven image-to-image translation based on a pretrained text-to-image diffusion model.
Our method aims to generate a target image by selectively editing the regions of interest in a source image.
arXiv Detail & Related papers (2023-05-29T10:48:34Z) - Diffusion-based Image Translation using Disentangled Style and Content Representation [51.188396199083336]
Diffusion-based image translation guided by semantic texts or a single target image has enabled flexible style transfer.
It is often difficult to maintain the original content of the image during the reverse diffusion.
We present a novel diffusion-based unsupervised image translation method using disentangled style and content representation.
Our experimental results show that the proposed method outperforms state-of-the-art baseline models in both text-guided and image-guided translation tasks.
arXiv Detail & Related papers (2022-09-30T06:44:37Z) - Unbalanced Feature Transport for Exemplar-based Image Translation [51.54421432912801]
This paper presents a general image translation framework that incorporates optimal transport for feature alignment between conditional inputs and style exemplars in image translation.
We show that our method achieves superior image translation qualitatively and quantitatively as compared with the state-of-the-art.
arXiv Detail & Related papers (2021-06-19T12:07:48Z) - Unpaired Image-to-Image Translation via Latent Energy Transport [61.62293304236371]
Image-to-image translation aims to preserve source contents while translating to discriminative target styles between two visual domains.
In this paper, we propose to deploy an energy-based model (EBM) in the latent space of a pretrained autoencoder for this task.
Our model is the first to be applicable to 1024$\times$1024-resolution unpaired image translation.
arXiv Detail & Related papers (2020-12-01T17:18:58Z) - GAIT: Gradient Adjusted Unsupervised Image-to-Image Translation [5.076419064097734]
An adversarial loss is utilized to match the distributions of the translated and target image sets.
This may create artifacts if two domains have different marginal distributions, for example, in uniform areas.
We propose an unsupervised IIT that preserves the uniform regions after the translation.
arXiv Detail & Related papers (2020-09-02T08:04:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.