A Style-aware Discriminator for Controllable Image Translation
- URL: http://arxiv.org/abs/2203.15375v1
- Date: Tue, 29 Mar 2022 09:13:33 GMT
- Title: A Style-aware Discriminator for Controllable Image Translation
- Authors: Kunhee Kim, Sanghun Park, Eunyeong Jeon, Taehun Kim, Daijin Kim
- Abstract summary: Current image-to-image translations do not control the output domain beyond the classes used during training.
We propose a style-aware discriminator that acts as a critic as well as a style encoder to provide conditions.
Experiments on multiple datasets verify that the proposed model outperforms current state-of-the-art image-to-image translation methods.
- Score: 10.338078700632423
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Current image-to-image translations do not control the output domain beyond
the classes used during training, nor do they interpolate between different
domains well, leading to implausible results. This limitation largely arises
because labels do not consider the semantic distance. To mitigate such
problems, we propose a style-aware discriminator that acts as a critic as well
as a style encoder to provide conditions. The style-aware discriminator learns
a controllable style space using prototype-based self-supervised learning and
simultaneously guides the generator. Experiments on multiple datasets verify
that the proposed model outperforms current state-of-the-art image-to-image
translation methods. In contrast with current methods, the proposed approach
supports various applications, including style interpolation, content
transplantation, and local image translation.
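
As a concrete illustration of the idea, the sketch below shows what a discriminator that doubles as a style encoder could look like, based only on the abstract above: a shared backbone feeds a real/fake critic head and a style head, and a small set of learnable prototypes gives a simple self-supervised clustering objective over the style codes. All module names, dimensions, and the exact form of the prototype loss are assumptions made for illustration, not the authors' implementation.

```python
# A minimal sketch, assuming standard PyTorch building blocks; not the authors' code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class StyleAwareDiscriminator(nn.Module):
    """Shared backbone with two heads: an adversarial critic and a style encoder."""

    def __init__(self, style_dim=64, num_prototypes=32):
        super().__init__()
        self.backbone = nn.Sequential(  # assumed: small convolutional feature extractor
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(128, 256, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.critic_head = nn.Linear(256, 1)         # real/fake score (the "critic")
        self.style_head = nn.Linear(256, style_dim)  # style code used to condition the generator
        # Learnable prototypes over the style space (assumed SwAV-style clustering).
        self.prototypes = nn.Linear(style_dim, num_prototypes, bias=False)

    def forward(self, x):
        h = self.backbone(x)
        style = F.normalize(self.style_head(h), dim=1)
        return self.critic_head(h), style

    def prototype_loss(self, style_a, style_b, temperature=0.1):
        """Simplified prototype-based self-supervised loss: two augmented views of the
        same image should yield matching prototype assignments. This is a stand-in for
        the paper's objective, which may differ (e.g. by balancing cluster assignments)."""
        logits_a = self.prototypes(style_a) / temperature
        logits_b = self.prototypes(style_b) / temperature
        loss = -(F.softmax(logits_b.detach(), dim=1) * F.log_softmax(logits_a, dim=1)).sum(1).mean()
        loss = loss - (F.softmax(logits_a.detach(), dim=1) * F.log_softmax(logits_b, dim=1)).sum(1).mean()
        return 0.5 * loss


# Style interpolation, one of the applications mentioned above: encode two reference
# images and blend their style codes before conditioning a generator (the generator
# interface is hypothetical and not defined here).
disc = StyleAwareDiscriminator()
_, s1 = disc(torch.randn(1, 3, 64, 64))
_, s2 = disc(torch.randn(1, 3, 64, 64))
s_mix = F.normalize(0.5 * s1 + 0.5 * s2, dim=1)
# fake = generator(content_image, s_mix)
```

The design point suggested by the abstract is that the critic and the style encoder share one network, so the style space that conditions the generator is the same space in which the discriminator judges realism; applications such as style interpolation then amount to blending encoded style codes, as in the last lines of the sketch.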
Related papers
- Distractors-Immune Representation Learning with Cross-modal Contrastive Regularization for Change Captioning [71.14084801851381]
Change captioning aims to succinctly describe the semantic change between a pair of similar images.
Most existing methods directly capture the difference between them, which risks yielding error-prone difference features.
We propose a distractors-immune representation learning network that correlates the corresponding channels of two image representations.
arXiv Detail & Related papers (2024-07-16T13:00:33Z)
- DSI2I: Dense Style for Unpaired Image-to-Image Translation [70.93865212275412]
Unpaired exemplar-based image-to-image (UEI2I) translation aims to translate a source image to a target image domain with the style of a target image exemplar.
We propose to represent style as a dense feature map, allowing for a finer-grained transfer to the source image without requiring any external semantic information.
Our results show that the translations produced by our approach are more diverse, preserve the source content better, and are closer to the exemplars when compared to the state-of-the-art methods.
arXiv Detail & Related papers (2022-12-26T18:45:25Z)
- Separating Content and Style for Unsupervised Image-to-Image Translation [20.44733685446886]
Unsupervised image-to-image translation aims to learn the mapping between two visual domains with unpaired samples.
We propose to separate the content code and style code simultaneously in a unified framework.
By exploiting the correlation between the latent features and high-level domain-invariant tasks, the proposed framework demonstrates superior performance.
arXiv Detail & Related papers (2021-10-27T12:56:50Z)
- LSC-GAN: Latent Style Code Modeling for Continuous Image-to-image Translation [9.692858539011446]
This paper builds a model for I2I translation among continuously varying domains.
To handle continuous translation, we design editing modules that change the latent style code along two directions.
Experiments on age and viewing angle translation show that the proposed method can achieve high-quality results.
arXiv Detail & Related papers (2021-10-11T07:46:43Z)
- ISF-GAN: An Implicit Style Function for High-Resolution Image-to-Image Translation [55.47515538020578]
This work proposes an implicit style function (ISF) to straightforwardly achieve multi-modal and multi-domain image-to-image translation.
Our experiments on human face and animal manipulation show significant improvements over the baselines.
Our model enables cost-effective multi-modal unsupervised image-to-image translations at high resolution using pre-trained unconditional GANs.
arXiv Detail & Related papers (2021-09-26T04:51:39Z)
- Smoothing the Disentangled Latent Style Space for Unsupervised Image-to-Image Translation [56.55178339375146]
Image-to-Image (I2I) multi-domain translation models are usually also evaluated on the quality of their semantic results.
We propose a new training protocol based on three specific losses which help a translation network to learn a smooth and disentangled latent style space.
arXiv Detail & Related papers (2021-06-16T17:58:21Z)
- Contrastive Learning for Unsupervised Image-to-Image Translation [10.091669091440396]
We propose an unsupervised image-to-image translation method based on contrastive learning.
We randomly sample a pair of images and train the generator to change the appearance of one towards the other while keeping the original structure.
Experimental results show that our method outperforms the leading unsupervised baselines in terms of visual quality and translation accuracy.
arXiv Detail & Related papers (2021-05-07T08:43:38Z)
- Contrastive Learning for Unpaired Image-to-Image Translation [64.47477071705866]
In image-to-image translation, each patch in the output should reflect the content of the corresponding patch in the input, independent of domain.
We propose a framework based on contrastive learning to maximize mutual information between the two.
We demonstrate that our framework enables one-sided translation in the unpaired image-to-image translation setting, while improving quality and reducing training time.
arXiv Detail & Related papers (2020-07-30T17:59:58Z)
- Toward Zero-Shot Unsupervised Image-to-Image Translation [34.51633300727676]
We propose a zero-shot unsupervised image-to-image translation framework.
We introduce two strategies for exploiting the space spanned by the semantic attributes.
Our framework can be applied to many tasks, such as zero-shot classification and fashion design.
arXiv Detail & Related papers (2020-07-28T08:13:18Z)
- COCO-FUNIT: Few-Shot Unsupervised Image Translation with a Content Conditioned Style Encoder [70.23358875904891]
Unsupervised image-to-image translation aims to learn a mapping of an image in a given domain to an analogous image in a different domain.
We propose a new few-shot image translation model, COCO-FUNIT, which computes the style embedding of the example images conditioned on the input image.
Our model shows effectiveness in addressing the content loss problem.
arXiv Detail & Related papers (2020-07-15T02:01:14Z)
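
For the COCO-FUNIT entry above, a content-conditioned style encoder can be sketched roughly as follows: the style code is computed jointly from exemplar features and features of the input (content) image, so exemplar details that do not match the input content have less influence. The architecture, dimensions, and fusion scheme below are assumptions made for illustration and do not reproduce the published model.

```python
# A minimal sketch, assuming a simple concatenation-based fusion; not the published model.
import torch
import torch.nn as nn


def conv_pool_encoder(out_dim):
    # Assumed: tiny convolutional encoder followed by global average pooling.
    return nn.Sequential(
        nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.ReLU(),
        nn.Conv2d(64, out_dim, 4, stride=2, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    )


class ContentConditionedStyleEncoder(nn.Module):
    def __init__(self, feat_dim=128, style_dim=64):
        super().__init__()
        self.style_branch = conv_pool_encoder(feat_dim)    # looks at the style exemplar
        self.content_branch = conv_pool_encoder(feat_dim)  # looks at the input (content) image
        self.fuse = nn.Sequential(                         # style code depends on both branches
            nn.Linear(2 * feat_dim, 256), nn.ReLU(),
            nn.Linear(256, style_dim),
        )

    def forward(self, content_image, exemplar):
        s = self.style_branch(exemplar)
        c = self.content_branch(content_image)
        return self.fuse(torch.cat([s, c], dim=1))


# Usage (generator interface is hypothetical):
# style = encoder(content_image, exemplar); fake = generator(content_image, style)
```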
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.