LSC-GAN: Latent Style Code Modeling for Continuous Image-to-image
Translation
- URL: http://arxiv.org/abs/2110.05052v1
- Date: Mon, 11 Oct 2021 07:46:43 GMT
- Title: LSC-GAN: Latent Style Code Modeling for Continuous Image-to-image
Translation
- Authors: Qiusheng Huang, Xueqi Hu, Li Sun and Qingli Li
- Abstract summary: This paper builds a model for I2I translation among continuously varying domains.
To handle continuous translation, we design editing modules that change the latent style code along two directions.
Experiments on age and viewing angle translation show that the proposed method achieves high-quality results.
- Score: 9.692858539011446
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Image-to-image (I2I) translation is usually carried out among discrete
domains. However, image domains, often corresponding to a physical value, are
usually continuous. In other words, images change gradually with the value, and
there is no obvious gap between different domains. This paper builds a model
for I2I translation among continuously varying domains. We first divide the
whole domain range into discrete intervals and explicitly model the latent
style code for the center of each interval. To handle continuous translation,
we design editing modules that change the latent style code along two
directions. These editing modules help to constrain the codes for the domain
centers during training, so that the model better captures the relation among
them. To obtain diverse results, the latent style code is further diversified
with either random noise or features from a reference image, giving an
individual style code to the decoder for label-based or reference-based
synthesis. Extensive experiments on age and viewing-angle translation show
that the proposed method achieves high-quality results and is flexible for
users.
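To make the editing idea concrete, below is a minimal PyTorch-style sketch of how interval-centered style codes and two-direction editing modules could fit together. It is not the authors' implementation: the class names, dimensions, the additive editing rule, the `continuous_style_code` helper, and the reading of "two directions" as a forward and a backward edit along the attribute axis are all illustrative assumptions.

```python
# Minimal sketch (PyTorch) of interval-centered latent style codes with
# bidirectional editing modules. Names, sizes, and the additive editing rule
# are illustrative assumptions, not the paper's actual implementation.
import torch
import torch.nn as nn


class LatentStyleBank(nn.Module):
    """One learnable style code per domain interval center."""

    def __init__(self, num_intervals: int, style_dim: int = 64):
        super().__init__()
        self.centers = nn.Parameter(torch.randn(num_intervals, style_dim))

    def forward(self, interval_idx: torch.Tensor) -> torch.Tensor:
        return self.centers[interval_idx]


class EditingModule(nn.Module):
    """Shifts a style code along one direction (e.g. towards older / larger angle)."""

    def __init__(self, style_dim: int = 64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(style_dim + 1, style_dim),
            nn.ReLU(),
            nn.Linear(style_dim, style_dim),
        )

    def forward(self, code: torch.Tensor, step: torch.Tensor) -> torch.Tensor:
        # `step` is the non-negative distance to move along this direction.
        return code + self.mlp(torch.cat([code, step.unsqueeze(1)], dim=1))


def continuous_style_code(bank, edit_fwd, edit_bwd, value, edges):
    """Map a continuous attribute value (e.g. age) to a style code by editing
    the code of its interval center forward or backward."""
    # `edges` holds the interval boundaries, length = num_intervals + 1.
    idx = (torch.bucketize(value, edges) - 1).clamp(0, bank.centers.shape[0] - 1)
    center = 0.5 * (edges[idx] + edges[idx + 1])  # interval center value
    base = bank(idx)
    delta = value - center
    # Both directions are computed and then selected per sample; wasteful but
    # keeps the sketch simple.
    forward = edit_fwd(base, delta.clamp(min=0.0))
    backward = edit_bwd(base, (-delta).clamp(min=0.0))
    return torch.where((delta >= 0).unsqueeze(1), forward, backward)


# Example: 8 age intervals covering 0-80 years, query ages 23 and 67.
bank = LatentStyleBank(num_intervals=8)
edit_fwd, edit_bwd = EditingModule(), EditingModule()
edges = torch.linspace(0.0, 80.0, steps=9)
codes = continuous_style_code(bank, edit_fwd, edit_bwd,
                              torch.tensor([23.0, 67.0]), edges)
print(codes.shape)  # torch.Size([2, 64])
```

In a sketch like this, chaining the forward and backward modules between adjacent interval centers during training is one plausible way to constrain the center codes to stay consistent with each other, while at inference the same modules yield a style code for any intermediate attribute value.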
Related papers
- Frequency-Controlled Diffusion Model for Versatile Text-Guided Image-to-Image Translation [17.30877810859863] (2024-07-03)
Large-scale text-to-image (T2I) diffusion models have emerged as a powerful tool for image-to-image translation (I2I).
This paper proposes the frequency-controlled diffusion model (FCDiffusion), an end-to-end diffusion-based framework.
- Smooth image-to-image translations with latent space interpolations [64.8170758294427] (2022-10-03)
Multi-domain image-to-image (I2I) translations can transform a source image according to the style of a target domain.
We show that our regularization techniques can improve the state-of-the-art I2I translations by a large margin.
- A Style-aware Discriminator for Controllable Image Translation [10.338078700632423] (2022-03-29)
Current image-to-image translations do not control the output domain beyond the classes used during training.
We propose a style-aware discriminator that acts as a critic as well as a style encoder to provide conditions.
Experiments on multiple datasets verify that the proposed model outperforms current state-of-the-art image-to-image translation methods.
- Multi-domain Unsupervised Image-to-Image Translation with Appearance Adaptive Convolution [62.4972011636884] (2022-02-06)
We propose a novel multi-domain unsupervised image-to-image translation (MDUIT) framework.
We exploit the decomposed content feature and appearance adaptive convolution to translate an image into a target appearance.
We show that the proposed method produces visually diverse and plausible results in multiple domains compared to the state-of-the-art methods.
- Separating Content and Style for Unsupervised Image-to-Image Translation [20.44733685446886] (2021-10-27)
Unsupervised image-to-image translation aims to learn the mapping between two visual domains with unpaired samples.
We propose to separate the content code and style code simultaneously in a unified framework.
Based on the correlation between the latent features and the high-level domain-invariant tasks, the proposed framework demonstrates superior performance.
- ISF-GAN: An Implicit Style Function for High-Resolution Image-to-Image Translation [55.47515538020578] (2021-09-26)
This work proposes an implicit style function (ISF) to straightforwardly achieve multi-modal and multi-domain image-to-image translation.
Our results on human face and animal manipulations show significant improvements over the baselines.
Our model enables cost-effective multi-modal unsupervised image-to-image translation at high resolution using pre-trained unconditional GANs.
- Smoothing the Disentangled Latent Style Space for Unsupervised Image-to-Image Translation [56.55178339375146] (2021-06-16)
Image-to-Image (I2I) multi-domain translation models are usually also evaluated using the quality of their semantic results.
We propose a new training protocol based on three specific losses which help a translation network to learn a smooth and disentangled latent style space.
- Unpaired Image-to-Image Translation via Latent Energy Transport [61.62293304236371] (2020-12-01)
Image-to-image translation aims to preserve source contents while translating to discriminative target styles between two visual domains.
In this paper, we propose to deploy an energy-based model (EBM) in the latent space of a pretrained autoencoder for this task.
Our model is the first to be applicable to 1024×1024-resolution unpaired image translation.
- Unsupervised Image-to-Image Translation via Pre-trained StyleGAN2 Network [73.5062435623908] (2020-10-12)
We propose a new I2I translation method that generates a new model in the target domain via a series of model transformations.
By feeding the latent vector into the generated model, we can perform I2I translation between the source domain and target domain.
This list is automatically generated from the titles and abstracts of the papers on this site.