S2ST: Image-to-Image Translation in the Seed Space of Latent Diffusion
- URL: http://arxiv.org/abs/2312.00116v1
- Date: Thu, 30 Nov 2023 18:59:49 GMT
- Title: S2ST: Image-to-Image Translation in the Seed Space of Latent Diffusion
- Authors: Or Greenberg, Eran Kishon, Dani Lischinski
- Abstract summary: We introduce S2ST, a novel framework designed to accomplish global I2IT in complex images.
S2ST operates within the seed space of a Latent Diffusion Model, thereby leveraging the powerful image priors learned by the latter.
We show that S2ST surpasses state-of-the-art GAN-based I2IT methods, as well as diffusion-based approaches, for complex automotive scenes.
- Score: 23.142097481682306
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Image-to-image translation (I2IT) refers to the process of transforming
images from a source domain to a target domain while maintaining a fundamental
connection in terms of image content. In the past few years, remarkable
advancements in I2IT were achieved by Generative Adversarial Networks (GANs),
which nevertheless struggle with translations requiring high precision.
Recently, Diffusion Models have established themselves as the engine of choice
for image generation. In this paper we introduce S2ST, a novel framework
designed to accomplish global I2IT in complex photorealistic images, such as
day-to-night or clear-to-rain translations of automotive scenes. S2ST operates
within the seed space of a Latent Diffusion Model, thereby leveraging the
powerful image priors learned by the latter. We show that S2ST surpasses
state-of-the-art GAN-based I2IT methods, as well as diffusion-based approaches,
for complex automotive scenes, improving fidelity while respecting the target
domain's appearance across a variety of domains. Notably, S2ST obviates the
necessity for training domain-specific translation networks.
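As a rough illustration of what "operating in the seed space" can mean in practice, the sketch below inverts a source image to its diffusion seed and then optimizes that seed under target-domain conditioning. This is a minimal sketch of the general idea, not the authors' published S2ST algorithm; `ddim_invert`, `ddim_sample`, and `structure_loss` are hypothetical placeholders standing in for a latent-diffusion backend.

```python
import torch

def seed_space_translation(source_latent, src_cond, tgt_cond,
                           ddim_invert, ddim_sample, structure_loss,
                           steps=100, lr=1e-2):
    """Hypothetical sketch of seed-space I2IT (not the authors' exact method).

    source_latent      : VAE latent of the source image, shape (1, C, H, W)
    src_cond / tgt_cond: conditioning (e.g. text embeddings) for the two domains
    ddim_invert, ddim_sample, structure_loss: placeholder callables assumed to
        be provided by a latent-diffusion backend.
    """
    # 1) Recover the seed (initial noise) that reproduces the source image
    #    under source-domain conditioning.
    seed = ddim_invert(source_latent, cond=src_cond).detach()

    # 2) Optimize the seed so that sampling with target-domain conditioning
    #    keeps the source layout while adopting the target appearance.
    seed = seed.clone().requires_grad_(True)
    opt = torch.optim.Adam([seed], lr=lr)
    for _ in range(steps):
        translated = ddim_sample(seed, cond=tgt_cond)     # sampling kept differentiable here
        loss = structure_loss(translated, source_latent)  # content/layout preservation term
        opt.zero_grad()
        loss.backward()
        opt.step()

    # 3) Final translation: one more sampling pass from the optimized seed.
    return ddim_sample(seed.detach(), cond=tgt_cond)
```

Because no translation network is trained in such a loop, the same procedure can in principle be reused for day-to-night, clear-to-rain, or any other conditioning pair.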
Related papers
- An Analysis for Image-to-Image Translation and Style Transfer [7.074445137050722]
We discuss the differences and connections between image-to-image translation and style transfer.
The discussion covers concepts, formulations, training modes, evaluation procedures, and visualization results.
arXiv Detail & Related papers (2024-08-12T08:49:00Z)
- Frequency-Controlled Diffusion Model for Versatile Text-Guided Image-to-Image Translation [17.30877810859863]
Large-scale text-to-image (T2I) diffusion models have emerged as a powerful tool for image-to-image translation (I2I).
This paper proposes the frequency-controlled diffusion model (FCDiffusion), an end-to-end diffusion-based framework.
arXiv Detail & Related papers (2024-07-03T11:05:19Z)
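As a toy illustration of the frequency-band control mentioned in the FCDiffusion entry above: the published method filters latent features in the DCT domain, whereas the sketch below uses a simple FFT mask as a stand-in.

```python
import torch

def frequency_band_filter(latent, keep="low", cutoff=0.25):
    """Illustrative sketch (not FCDiffusion's exact operator): keep either the
    low- or high-frequency band of a latent feature map via an FFT mask.

    latent : (B, C, H, W) tensor; cutoff is the normalized radius of the band.
    """
    _, _, H, W = latent.shape
    freq = torch.fft.fftshift(torch.fft.fft2(latent), dim=(-2, -1))

    # Radial frequency mask centered on the DC component of the spectrum.
    ys = torch.linspace(-1, 1, H, device=latent.device).view(-1, 1)
    xs = torch.linspace(-1, 1, W, device=latent.device).view(1, -1)
    radius = torch.sqrt(ys ** 2 + xs ** 2)
    mask = (radius <= cutoff).float() if keep == "low" else (radius > cutoff).float()

    filtered = freq * mask  # zero out the unwanted frequency band
    return torch.fft.ifft2(torch.fft.ifftshift(filtered, dim=(-2, -1))).real
```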
- Jurassic World Remake: Bringing Ancient Fossils Back to Life via Zero-Shot Long Image-to-Image Translation [97.40572668025273]
We use text-guided latent diffusion models for zero-shot image-to-image translation (I2I) across large domain gaps.
Being able to perform translations across large domain gaps has a wide variety of real-world applications in criminology, astrology, environmental conservation, and paleontology.
arXiv Detail & Related papers (2023-08-14T17:59:31Z)
- Leveraging in-domain supervision for unsupervised image-to-image translation tasks via multi-stream generators [4.726777092009554]
We introduce two techniques for incorporating in-domain prior knowledge to improve translation quality.
We propose splitting the input data according to semantic masks, explicitly guiding the network to different behavior for the different regions of the image.
In addition, we propose training a semantic segmentation network alongside the translation task and leveraging its output as a loss term that improves robustness.
arXiv Detail & Related papers (2021-12-30T15:29:36Z)
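A toy sketch of the mask-guided splitting described in the entry above; the per-region streams are placeholder conv stacks, not the paper's generator architecture.

```python
import torch
import torch.nn as nn

class MultiStreamGenerator(nn.Module):
    """Illustrative sketch of mask-guided multi-stream translation: each
    semantic class gets its own lightweight stream, and the stream outputs
    are recombined with the masks."""

    def __init__(self, num_classes, ch=3, width=32):
        super().__init__()
        self.streams = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(ch, width, 3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(width, ch, 3, padding=1),
            )
            for _ in range(num_classes)
        ])

    def forward(self, image, masks):
        # image: (B, 3, H, W); masks: (B, K, H, W) one-hot semantic masks.
        out = torch.zeros_like(image)
        for k, stream in enumerate(self.streams):
            region = image * masks[:, k:k + 1]               # isolate one semantic region
            out = out + stream(region) * masks[:, k:k + 1]   # translate it and paste back
        return out
```

A segmentation network predicting `masks` back from the translated output could then supply the auxiliary robustness loss mentioned above.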
- Image-to-image Translation as a Unique Source of Knowledge [91.3755431537592]
This article translates labelled datasets from the optical domain to the SAR domain using different state-of-the-art I2I algorithms.
Stacking is proposed as a way of combining the knowledge learned from the different I2I translations and is evaluated against single models.
arXiv Detail & Related papers (2021-12-03T12:12:04Z)
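For intuition, a small stacking sketch over several I2I-translated training sets; the scikit-learn models and variable names are illustrative assumptions, not the article's actual setup.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def stack_i2i_models(base_models, translated_sets, X_val_sets, y_val):
    """Illustrative stacking sketch: each base model is trained on a dataset
    produced by a different I2I translation; a meta-learner then combines
    their predictions."""
    # Fit one base model per I2I-translated training set.
    for model, (X_tr, y_tr) in zip(base_models, translated_sets):
        model.fit(X_tr, y_tr)

    # Stack the base models' validation probabilities as meta-features.
    meta_features = np.hstack([
        model.predict_proba(X_val)
        for model, X_val in zip(base_models, X_val_sets)
    ])

    meta_learner = LogisticRegression(max_iter=1000)
    meta_learner.fit(meta_features, y_val)
    return meta_learner
```

At inference time, each base model scores the same input and the meta-learner combines the stacked probabilities.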
- Aggregated Contextual Transformations for High-Resolution Image Inpainting [57.241749273816374]
We propose an enhanced GAN-based model, named Aggregated COntextual-Transformation GAN (AOT-GAN), for high-resolution image inpainting.
To enhance context reasoning, we construct the generator of AOT-GAN by stacking multiple layers of a proposed AOT block.
For improving texture synthesis, we enhance the discriminator of AOT-GAN by training it with a tailored mask-prediction task.
arXiv Detail & Related papers (2021-04-03T15:50:17Z)
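A rough sketch of an aggregated contextual transformation block in the spirit of the entry above; the dilation rates, normalization, and gating scheme are simplified guesses rather than the published AOT-GAN design.

```python
import torch
import torch.nn as nn

class AOTBlockSketch(nn.Module):
    """Sketch of an aggregated contextual transformation block: parallel
    dilated convolutions capture context at several scales, their outputs are
    concatenated, and a learned gate fuses the result with the input."""

    def __init__(self, ch, rates=(1, 2, 4, 8)):
        super().__init__()
        branch_ch = ch // len(rates)
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(ch, branch_ch, 3, padding=r, dilation=r),
                nn.ReLU(inplace=True),
            )
            for r in rates
        ])
        self.fuse = nn.Conv2d(branch_ch * len(rates), ch, 3, padding=1)
        self.gate = nn.Conv2d(ch, ch, 3, padding=1)

    def forward(self, x):
        context = self.fuse(torch.cat([b(x) for b in self.branches], dim=1))
        g = torch.sigmoid(self.gate(x))       # spatially varying gate
        return x * (1 - g) + context * g      # gated residual fusion
```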
- Unsupervised Image-to-Image Translation via Pre-trained StyleGAN2 Network [73.5062435623908]
We propose a new I2I translation method that generates a model in the target domain by applying a series of model transformations to a pre-trained StyleGAN2 network.
By feeding the latent vector into the generated model, we can perform I2I translation between the source and target domains.
arXiv Detail & Related papers (2020-10-12T13:51:40Z)
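A minimal sketch of that latent-reuse step, assuming a GAN-inversion routine and an already transformed target-domain generator; both callables are placeholders, not the paper's actual procedure.

```python
import torch

def stylegan2_i2i_sketch(source_img, invert_to_latent, source_G, target_G):
    """Illustrative sketch: invert the source image into the latent space of a
    source-domain StyleGAN2, then decode the same latent with a generator
    adapted to the target domain. `invert_to_latent` and the generators are
    hypothetical placeholders."""
    with torch.no_grad():
        w = invert_to_latent(source_img, source_G)  # GAN inversion of the source image
        translated = target_G(w)                    # re-synthesize in the target domain
    return translated
```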
- TIME: Text and Image Mutual-Translation Adversarial Networks [55.1298552773457]
We propose Text and Image Mutual-Translation Adversarial Networks (TIME).
TIME learns a T2I generator G and an image captioning discriminator D under the Generative Adversarial Network framework.
In experiments, TIME achieves state-of-the-art (SOTA) performance on the CUB and MS-COCO datasets.
arXiv Detail & Related papers (2020-05-27T06:40:12Z)
- Domain Adaptation for Image Dehazing [72.15994735131835]
Most existing methods train a dehazing model on synthetic hazy images, but such models generalize poorly to real hazy images due to the domain shift.
We propose a domain adaptation paradigm, which consists of an image translation module and two image dehazing modules.
Experimental results on both synthetic and real-world images demonstrate that our model performs favorably against the state-of-the-art dehazing algorithms.
arXiv Detail & Related papers (2020-05-10T13:54:56Z)
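A hedged sketch of how such a translation module and two dehazing modules might be trained jointly; the module names and the particular loss mix are assumptions for illustration, not the paper's exact objectives.

```python
import torch
import torch.nn.functional as F

def domain_adaptive_dehazing_step(translate_s2r, dehaze_syn, dehaze_real,
                                  syn_hazy, syn_clear, real_hazy):
    """Illustrative sketch of the translation + dual-dehazing idea: translate
    synthetic hazy images toward the real domain, supervise the synthetic
    branch with ground truth, and tie the two branches together with a
    consistency term on unlabeled real images."""
    # Supervised dehazing on synthetic data with ground-truth clear images.
    loss_syn = F.l1_loss(dehaze_syn(syn_hazy), syn_clear)

    # Translate synthetic hazy images into the real-haze style, then ask the
    # real-domain dehazer to reproduce the same clear target.
    syn_as_real = translate_s2r(syn_hazy)
    loss_adapt = F.l1_loss(dehaze_real(syn_as_real), syn_clear)

    # Consistency between the two dehazing branches on unlabeled real images.
    loss_consistency = F.l1_loss(dehaze_real(real_hazy), dehaze_syn(real_hazy).detach())

    return loss_syn + loss_adapt + loss_consistency
```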