Progressive Energy-Based Cooperative Learning for Multi-Domain
Image-to-Image Translation
- URL: http://arxiv.org/abs/2306.14448v3
- Date: Mon, 15 Jan 2024 07:45:02 GMT
- Title: Progressive Energy-Based Cooperative Learning for Multi-Domain
Image-to-Image Translation
- Authors: Weinan Song, Yaxuan Zhu, Lei He, Yingnian Wu, and Jianwen Xie
- Abstract summary: We study a novel energy-based cooperative learning framework for multi-domain image-to-image translation.
The framework consists of four components: descriptor, translator, style encoder, and style generator.
- Score: 53.682651509759744
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper studies a novel energy-based cooperative learning framework for
multi-domain image-to-image translation. The framework consists of four
components: descriptor, translator, style encoder, and style generator. The
descriptor is a multi-head energy-based model that represents a multi-domain
image distribution. The components of translator, style encoder, and style
generator constitute a diversified image generator. Specifically, given an
input image from a source domain, the translator turns it into a stylised
output image of the target domain according to a style code, which can be
inferred by the style encoder from a reference image or produced by the style
generator from random noise. Since the style generator is represented as a
domain-specific distribution of style codes, the translator can provide a
one-to-many transformation (i.e., diversified generation) between the source
domain and the target domain. To train our framework, we propose a likelihood-based
multi-domain cooperative learning algorithm to jointly train the multi-domain
descriptor and the diversified image generator (including translator, style
encoder, and style generator modules) via multi-domain MCMC teaching, in which
the descriptor guides the diversified image generator to shift its probability
density toward the data distribution, while the diversified image generator
uses its randomly translated images to initialize the descriptor's Langevin
dynamics process for efficient sampling.
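To make the described training procedure concrete, below is a minimal PyTorch-style sketch of one cooperative update. Everything here is an illustrative assumption rather than the authors' implementation: the module architectures, image size, style dimension, Langevin hyper-parameters, and names such as Descriptor, Translator, StyleGenerator, and langevin are placeholders, and the style encoder branch (inferring codes from a reference image) is omitted for brevity.

```python
# Hypothetical sketch of the cooperative learning loop (not the authors' code).
import torch
import torch.nn as nn

IMG, STYLE, DOMAINS = 32, 8, 3  # assumed image size, style-code dim, number of domains

class Descriptor(nn.Module):
    """Multi-head energy-based model: one scalar energy head per domain."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(nn.Conv2d(3, 16, 3, 2, 1), nn.ReLU(),
                                      nn.Conv2d(16, 32, 3, 2, 1), nn.ReLU(),
                                      nn.Flatten())
        self.heads = nn.Linear(32 * (IMG // 4) ** 2, DOMAINS)
    def forward(self, x, d):
        return self.heads(self.backbone(x)).gather(1, d[:, None]).squeeze(1)

class Translator(nn.Module):
    """Maps a source image plus a target-domain style code to an output image."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Conv2d(3 + STYLE, 32, 3, 1, 1), nn.ReLU(),
                                 nn.Conv2d(32, 3, 3, 1, 1), nn.Tanh())
    def forward(self, x, s):
        s_map = s[:, :, None, None].expand(-1, -1, IMG, IMG)
        return self.net(torch.cat([x, s_map], dim=1))

class StyleGenerator(nn.Module):
    """Maps random noise plus a domain label to a domain-specific style code."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(16 + DOMAINS, 64), nn.ReLU(),
                                 nn.Linear(64, STYLE))
    def forward(self, z, d):
        d_onehot = nn.functional.one_hot(d, DOMAINS).float()
        return self.net(torch.cat([z, d_onehot], dim=1))

def langevin(descriptor, x, d, steps=15, step_size=0.01):
    """Short-run Langevin dynamics on images, initialized from translator outputs."""
    x = x.detach().requires_grad_(True)
    for _ in range(steps):
        grad = torch.autograd.grad(descriptor(x, d).sum(), x)[0]
        x = (x - 0.5 * step_size ** 2 * grad
             + step_size * torch.randn_like(x)).detach().requires_grad_(True)
    return x.detach()

descriptor, translator, style_gen = Descriptor(), Translator(), StyleGenerator()
opt_d = torch.optim.Adam(descriptor.parameters(), lr=1e-4)
opt_g = torch.optim.Adam(list(translator.parameters()) + list(style_gen.parameters()), lr=1e-4)

# One cooperative step on a toy batch (random tensors stand in for real data).
x_src = torch.rand(4, 3, IMG, IMG) * 2 - 1      # source-domain images
x_tgt = torch.rand(4, 3, IMG, IMG) * 2 - 1      # target-domain images
d_tgt = torch.randint(0, DOMAINS, (4,))         # target-domain labels

s = style_gen(torch.randn(4, 16), d_tgt)        # style code sampled from random noise
x_init = translator(x_src, s)                   # translated images initialize the MCMC
x_revised = langevin(descriptor, x_init, d_tgt) # descriptor refines them via Langevin dynamics

# Descriptor update: lower energy on observed data, raise it on revised samples.
loss_d = descriptor(x_tgt, d_tgt).mean() - descriptor(x_revised, d_tgt).mean()
opt_d.zero_grad(); loss_d.backward(); opt_d.step()

# MCMC teaching: the generator chases the revised samples produced by the descriptor.
loss_g = ((translator(x_src, s) - x_revised) ** 2).mean()
opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```

In this toy setup the descriptor output is read as an energy, so Langevin dynamics moves samples downhill on the energy surface, and the generator update regresses the translator toward the descriptor-revised images, which is the "teaching" signal the abstract refers to.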
Related papers
- I2I-Galip: Unsupervised Medical Image Translation Using Generative Adversarial CLIP [30.506544165999564]
Unpaired image-to-image translation is a challenging task due to the absence of paired examples.
We propose a new image-to-image translation framework named Image-to-Image-Generative-Adversarial-CLIP (I2I-Galip).
arXiv Detail & Related papers (2024-09-19T01:44:50Z)
- ARTEMIS: Using GANs with Multiple Discriminators to Generate Art [0.0]
We propose a novel method for generating abstract art.
First, an autoencoder is trained to encode and decode the style representations of images, which are extracted from source images with a pretrained VGG network.
The decoder component of the autoencoder is extracted and used as a generator in a GAN.
arXiv Detail & Related papers (2023-11-14T16:19:29Z)
- SCONE-GAN: Semantic Contrastive learning-based Generative Adversarial Network for an end-to-end image translation [18.93434486338439]
SCONE-GAN is shown to be effective for learning to generate realistic and diverse scenery images.
To generate more realistic and diverse images, we introduce a style reference image.
We validate the proposed algorithm for image-to-image translation and stylizing outdoor images.
arXiv Detail & Related papers (2023-11-07T10:29:16Z)
- Unified Multi-Modal Latent Diffusion for Joint Subject and Text Conditional Image Generation [63.061871048769596]
We present a novel Unified Multi-Modal Latent Diffusion (UMM-Diffusion) which takes joint texts and images containing specified subjects as input sequences.
To be more specific, both input texts and images are encoded into one unified multi-modal latent space.
Our method is able to generate high-quality images with complex semantics from both aspects of input texts and images.
arXiv Detail & Related papers (2023-03-16T13:50:20Z)
- Gradient Adjusting Networks for Domain Inversion [82.72289618025084]
StyleGAN2 was demonstrated to be a powerful image generation engine that supports semantic editing.
We present a per-image optimization method that tunes a StyleGAN2 generator such that it achieves a local edit to the generator's weights.
Our experiments show a sizable gap in performance over the current state of the art in this very active domain.
arXiv Detail & Related papers (2023-02-22T14:47:57Z)
- Towards Diverse and Faithful One-shot Adaption of Generative Adversarial Networks [54.80435295622583]
One-shot generative domain adaption aims to transfer a pre-trained generator on one domain to a new domain using one reference image only.
We present a novel one-shot generative domain adaption method, i.e., DiFa, for diverse generation and faithful adaptation.
arXiv Detail & Related papers (2022-07-18T16:29:41Z) - StyleGAN-NADA: CLIP-Guided Domain Adaptation of Image Generators [63.85888518950824]
We present a text-driven method that allows shifting a generative model to new domains.
We show that through natural language prompts and a few minutes of training, our method can adapt a generator across a multitude of domains.
arXiv Detail & Related papers (2021-08-02T14:46:46Z)
- Retrieval Guided Unsupervised Multi-domain Image-to-Image Translation [59.73535607392732]
Image-to-image translation aims to learn a mapping that transforms an image from one visual domain to another.
We propose the use of an image retrieval system to assist the image-to-image translation task.
arXiv Detail & Related papers (2020-08-11T20:11:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.