Progressive Energy-Based Cooperative Learning for Multi-Domain
Image-to-Image Translation
- URL: http://arxiv.org/abs/2306.14448v3
- Date: Mon, 15 Jan 2024 07:45:02 GMT
- Title: Progressive Energy-Based Cooperative Learning for Multi-Domain
Image-to-Image Translation
- Authors: Weinan Song, Yaxuan Zhu, Lei He, Yingnian Wu, and Jianwen Xie
- Abstract summary: We study a novel energy-based cooperative learning framework for multi-domain image-to-image translation.
The framework consists of four components: descriptor, translator, style encoder, and style generator.
- Score: 53.682651509759744
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper studies a novel energy-based cooperative learning framework for
multi-domain image-to-image translation. The framework consists of four
components: descriptor, translator, style encoder, and style generator. The
descriptor is a multi-head energy-based model that represents a multi-domain
image distribution. The components of translator, style encoder, and style
generator constitute a diversified image generator. Specifically, given an
input image from a source domain, the translator turns it into a stylised
output image of the target domain according to a style code, which can be
inferred by the style encoder from a reference image or produced by the style
generator from random noise. Since the style generator is represented as a
domain-specific distribution of style codes, the translator can provide a
one-to-many transformation (i.e., diversified generation) between the source
domain and the target domain. To train our framework, we propose a likelihood-based
multi-domain cooperative learning algorithm to jointly train the multi-domain
descriptor and the diversified image generator (including translator, style
encoder, and style generator modules) via multi-domain MCMC teaching, in which
the descriptor guides the diversified image generator to shift its probability
density toward the data distribution, while the diversified image generator
uses its randomly translated images to initialize the descriptor's Langevin
dynamics process for efficient sampling.
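To make the described training procedure concrete, below is a minimal PyTorch-style sketch of one cooperative update. Everything here is an illustrative assumption rather than the authors' implementation: the module architectures, image size, style dimension, Langevin hyper-parameters, and names such as Descriptor, Translator, StyleGenerator, and langevin are placeholders, and the style encoder branch (inferring codes from a reference image) is omitted for brevity.

```python
# Hypothetical sketch of the cooperative learning loop (not the authors' code).
import torch
import torch.nn as nn

IMG, STYLE, DOMAINS = 32, 8, 3  # assumed image size, style-code dim, number of domains

class Descriptor(nn.Module):
    """Multi-head energy-based model: one scalar energy head per domain."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(nn.Conv2d(3, 16, 3, 2, 1), nn.ReLU(),
                                      nn.Conv2d(16, 32, 3, 2, 1), nn.ReLU(),
                                      nn.Flatten())
        self.heads = nn.Linear(32 * (IMG // 4) ** 2, DOMAINS)
    def forward(self, x, d):
        return self.heads(self.backbone(x)).gather(1, d[:, None]).squeeze(1)

class Translator(nn.Module):
    """Maps a source image plus a target-domain style code to an output image."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Conv2d(3 + STYLE, 32, 3, 1, 1), nn.ReLU(),
                                 nn.Conv2d(32, 3, 3, 1, 1), nn.Tanh())
    def forward(self, x, s):
        s_map = s[:, :, None, None].expand(-1, -1, IMG, IMG)
        return self.net(torch.cat([x, s_map], dim=1))

class StyleGenerator(nn.Module):
    """Maps random noise plus a domain label to a domain-specific style code."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(16 + DOMAINS, 64), nn.ReLU(),
                                 nn.Linear(64, STYLE))
    def forward(self, z, d):
        d_onehot = nn.functional.one_hot(d, DOMAINS).float()
        return self.net(torch.cat([z, d_onehot], dim=1))

def langevin(descriptor, x, d, steps=15, step_size=0.01):
    """Short-run Langevin dynamics on images, initialized from translator outputs."""
    x = x.detach().requires_grad_(True)
    for _ in range(steps):
        grad = torch.autograd.grad(descriptor(x, d).sum(), x)[0]
        x = (x - 0.5 * step_size ** 2 * grad
             + step_size * torch.randn_like(x)).detach().requires_grad_(True)
    return x.detach()

descriptor, translator, style_gen = Descriptor(), Translator(), StyleGenerator()
opt_d = torch.optim.Adam(descriptor.parameters(), lr=1e-4)
opt_g = torch.optim.Adam(list(translator.parameters()) + list(style_gen.parameters()), lr=1e-4)

# One cooperative step on a toy batch (random tensors stand in for real data).
x_src = torch.rand(4, 3, IMG, IMG) * 2 - 1      # source-domain images
x_tgt = torch.rand(4, 3, IMG, IMG) * 2 - 1      # target-domain images
d_tgt = torch.randint(0, DOMAINS, (4,))         # target-domain labels

s = style_gen(torch.randn(4, 16), d_tgt)        # style code sampled from random noise
x_init = translator(x_src, s)                   # translated images initialize the MCMC
x_revised = langevin(descriptor, x_init, d_tgt) # descriptor refines them via Langevin dynamics

# Descriptor update: lower energy on observed data, raise it on revised samples.
loss_d = descriptor(x_tgt, d_tgt).mean() - descriptor(x_revised, d_tgt).mean()
opt_d.zero_grad(); loss_d.backward(); opt_d.step()

# MCMC teaching: the generator chases the revised samples produced by the descriptor.
loss_g = ((translator(x_src, s) - x_revised) ** 2).mean()
opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```

In this toy setup the descriptor output is read as an energy, so Langevin dynamics moves samples downhill on the energy surface, and the generator update regresses the translator toward the descriptor-revised images, which is the "teaching" signal the abstract refers to.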
Related papers
- I2I-Galip: Unsupervised Medical Image Translation Using Generative Adversarial CLIP [30.506544165999564]
Unpaired image-to-image translation is a challenging task due to the absence of paired examples.
We propose a new image-to-image translation framework named Image-to-Image-Generative-Adversarial-CLIP (I2I-Galip).
arXiv Detail & Related papers (2024-09-19T01:44:50Z)
- ARTEMIS: Using GANs with Multiple Discriminators to Generate Art [0.0]
We propose a novel method for generating abstract art.
First, an autoencoder is trained to encode and decode the style representations of images, which are extracted from source images with a pretrained VGG network.
The decoder component of the autoencoder is extracted and used as a generator in a GAN.
arXiv Detail & Related papers (2023-11-14T16:19:29Z)
- SCONE-GAN: Semantic Contrastive learning-based Generative Adversarial Network for an end-to-end image translation [18.93434486338439]
SCONE-GAN is shown to be effective for learning to generate realistic and diverse scenery images.
To generate more realistic and diverse images, we introduce a style reference image.
We validate the proposed algorithm for image-to-image translation and stylizing outdoor images.
arXiv Detail & Related papers (2023-11-07T10:29:16Z)
- Unified Multi-Modal Latent Diffusion for Joint Subject and Text Conditional Image Generation [63.061871048769596]
We present a novel Unified Multi-Modal Latent Diffusion (UMM-Diffusion) which takes joint texts and images containing specified subjects as input sequences.
To be more specific, both input texts and images are encoded into one unified multi-modal latent space.
Our method is able to generate high-quality images with complex semantics from both aspects of input texts and images.
arXiv Detail & Related papers (2023-03-16T13:50:20Z)
- Gradient Adjusting Networks for Domain Inversion [82.72289618025084]
StyleGAN2 was demonstrated to be a powerful image generation engine that supports semantic editing.
We present a per-image optimization method that tunes a StyleGAN2 generator such that it achieves a local edit to the generator's weights.
Our experiments show a sizable gap in performance over the current state of the art in this very active domain.
arXiv Detail & Related papers (2023-02-22T14:47:57Z)
- Towards Diverse and Faithful One-shot Adaption of Generative Adversarial Networks [54.80435295622583]
One-shot generative domain adaption aims to transfer a pre-trained generator on one domain to a new domain using one reference image only.
We present a novel one-shot generative domain adaption method, i.e., DiFa, for diverse generation and faithful adaptation.
arXiv Detail & Related papers (2022-07-18T16:29:41Z) - StyleGAN-NADA: CLIP-Guided Domain Adaptation of Image Generators [63.85888518950824]
We present a text-driven method that allows shifting a generative model to new domains.
We show that through natural language prompts and a few minutes of training, our method can adapt a generator across a multitude of domains.
arXiv Detail & Related papers (2021-08-02T14:46:46Z)
- Retrieval Guided Unsupervised Multi-domain Image-to-Image Translation [59.73535607392732]
Image-to-image translation aims to learn a mapping that transforms an image from one visual domain to another.
We propose the use of an image retrieval system to assist the image-to-image translation task.
arXiv Detail & Related papers (2020-08-11T20:11:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.