Augmented Cyclic Consistency Regularization for Unpaired Image-to-Image
Translation
- URL: http://arxiv.org/abs/2003.00187v2
- Date: Mon, 12 Oct 2020 16:07:23 GMT
- Title: Augmented Cyclic Consistency Regularization for Unpaired Image-to-Image
Translation
- Authors: Takehiko Ohkawa, Naoto Inoue, Hirokatsu Kataoka, Nakamasa Inoue
- Abstract summary: Augmented Cyclic Consistency Regularization (ACCR) is a novel regularization method for unpaired I2I translation.
Our method outperforms the consistency regularized GAN (CR-GAN) in real-world translations.
- Score: 22.51574923085135
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Unpaired image-to-image (I2I) translation has received considerable attention
in pattern recognition and computer vision because of recent advancements in
generative adversarial networks (GANs). However, due to the lack of explicit
supervision, unpaired I2I models often fail to generate realistic images,
especially in challenging datasets with different backgrounds and poses. Hence,
stabilization is indispensable for GANs and applications of I2I translation.
Herein, we propose Augmented Cyclic Consistency Regularization (ACCR), a novel
regularization method for unpaired I2I translation. Our main idea is to enforce
consistency regularization originating from semi-supervised learning on the
discriminators leveraging real, fake, reconstructed, and augmented samples. We
regularize the discriminators to output similar predictions when fed pairs of
original and perturbed images. We qualitatively clarify why consistency
regularization on fake and reconstructed samples works well. Quantitatively,
our method outperforms the consistency regularized GAN (CR-GAN) in real-world
translations and demonstrates efficacy against several data augmentation
variants and cycle-consistent constraints.
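The abstract's core mechanism, penalizing a discriminator for predicting differently on an original image and on a perturbed copy, can be sketched in a few lines of PyTorch. The helper names, the augmentation hook, the weight lambda_cr, and the equal weighting over real, fake, and reconstructed samples are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def consistency_loss(discriminator, images, augment, lambda_cr=10.0):
    """CR-GAN-style consistency term: the discriminator should give
    similar predictions for an image and its perturbed copy."""
    d_orig = discriminator(images)
    d_aug = discriminator(augment(images))
    return lambda_cr * F.mse_loss(d_aug, d_orig.detach())

def accr_loss(discriminator, real, fake, recon, augment, lambda_cr=10.0):
    """ACCR extends the term to real, fake, and reconstructed samples
    (per the abstract); the equal weighting here is an assumption."""
    return sum(
        consistency_loss(discriminator, x, augment, lambda_cr)
        for x in (real, fake.detach(), recon.detach())
    )
```

In a CycleGAN-style setup this term would be added to each discriminator's adversarial loss; simple stochastic augmentations such as flips or small translations are typical choices for `augment`.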
Related papers
- UTSGAN: Unseen Transition Suss GAN for Transition-Aware Image-to-image Translation [57.99923293611923]
We introduce a transition-aware approach to I2I translation, where the data translation mapping is explicitly parameterized with a transition variable.
We propose the use of transition consistency, defined on the transition variable, to enable regularization of consistency on unobserved translations.
Based on these insights, we present Unseen Transition Suss GAN (UTSGAN), a generative framework that constructs a manifold for the transition with a transition encoder.
arXiv Detail & Related papers (2023-04-24T09:47:34Z)
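The transition-consistency idea in the UTSGAN entry above lends itself to a rough sketch: regularize the generator so that outputs for unobserved transition variables behave consistently. The interpolation scheme, the two-argument generator signature, and the loss form below are assumptions for illustration, not UTSGAN's actual construction.

```python
import torch
import torch.nn.functional as F

def transition_consistency(generator, x, t_a, t_b, alpha=0.5):
    """Hypothetical transition-consistency term: an unobserved
    transition (a mix of two observed ones) should produce an output
    consistent with mixing the corresponding observed outputs."""
    t_mix = alpha * t_a + (1 - alpha) * t_b   # unobserved transition
    y_mix = generator(x, t_mix)
    y_ref = alpha * generator(x, t_a) + (1 - alpha) * generator(x, t_b)
    return F.mse_loss(y_mix, y_ref.detach())
```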
- Spectral Normalization and Dual Contrastive Regularization for Image-to-Image Translation [9.029227024451506]
We propose a new unpaired I2I translation framework based on dual contrastive regularization and spectral normalization.
We conduct comprehensive experiments to evaluate the effectiveness of SN-DCR, and the results show that our method achieves state-of-the-art performance on multiple tasks.
arXiv Detail & Related papers (2023-04-22T05:22:24Z)
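Spectral normalization, the concrete half of the SN-DCR recipe above, is available directly in PyTorch; a minimal sketch follows. The PatchGAN-style layer layout is an assumption, and the dual contrastive regularization term is omitted.

```python
import torch.nn as nn
from torch.nn.utils import spectral_norm

def sn_conv(in_ch, out_ch, stride=2):
    """Conv2d wrapped with spectral normalization, which bounds the
    layer's largest singular value to stabilize GAN training."""
    return spectral_norm(nn.Conv2d(in_ch, out_ch, 4, stride, 1))

# Assumed PatchGAN-style layout; SN-DCR's actual network may differ.
discriminator = nn.Sequential(
    sn_conv(3, 64),    nn.LeakyReLU(0.2, inplace=True),
    sn_conv(64, 128),  nn.LeakyReLU(0.2, inplace=True),
    sn_conv(128, 256), nn.LeakyReLU(0.2, inplace=True),
    sn_conv(256, 1, stride=1),
)
```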
- Anticipating the Unseen Discrepancy for Vision and Language Navigation [63.399180481818405]
Vision-Language Navigation requires the agent to follow natural language instructions to reach a specific target.
The large discrepancy between seen and unseen environments makes it challenging for the agent to generalize well.
We propose Unseen Discrepancy Anticipating Vision and Language Navigation (DAVIS), which learns to generalize to unseen environments by encouraging test-time visual consistency.
arXiv Detail & Related papers (2022-09-10T19:04:40Z)
- ResiDualGAN: Resize-Residual DualGAN for Cross-Domain Remote Sensing Images Semantic Segmentation [15.177834801688979]
The performance of a semantic segmentation model for remote sensing (RS) images that is pretrained on an annotated dataset greatly decreases when tested on another, unannotated dataset because of the domain gap.
Adversarial generative methods, e.g., DualGAN, are utilized for unpaired image-to-image translation to minimize the pixel-level domain gap.
In this paper, ResiDualGAN is proposed for RS image translation, where a resizer module addresses the scale discrepancy of RS datasets.
arXiv Detail & Related papers (2022-01-27T13:56:54Z)
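A minimal sketch of what the resizer module mentioned in the ResiDualGAN entry above could look like: fixed bilinear rescaling to bridge the resolution gap, plus a small learned residual correction. The architecture and scale factor are assumptions; the paper's actual module may differ.

```python
import torch.nn as nn
import torch.nn.functional as F

class Resizer(nn.Module):
    """Hypothetical resizer: rescales an input image and adds a
    learnable residual on top of plain bilinear interpolation."""
    def __init__(self, channels=3, scale_factor=0.5):
        super().__init__()
        self.scale_factor = scale_factor
        self.residual = nn.Sequential(
            nn.Conv2d(channels, 16, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(16, channels, 3, padding=1),
        )

    def forward(self, x):
        x = F.interpolate(x, scale_factor=self.scale_factor,
                          mode="bilinear", align_corners=False)
        return x + self.residual(x)  # interpolation plus learned correction
```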
- Smoothing the Disentangled Latent Style Space for Unsupervised Image-to-Image Translation [56.55178339375146]
Image-to-Image (I2I) multi-domain translation models are usually also evaluated on the quality of their semantic results.
We propose a new training protocol based on three specific losses which help a translation network to learn a smooth and disentangled latent style space.
arXiv Detail & Related papers (2021-06-16T17:58:21Z)
- Generative Transition Mechanism to Image-to-Image Translation via Encoded Transformation [40.11493448767101]
We revisit the Image-to-Image (I2I) translation problem with transition consistency.
Existing I2I translation models mainly focus on maintaining consistency on results.
We propose to enforce both result consistency and transition consistency for I2I translation.
arXiv Detail & Related papers (2021-03-09T02:56:03Z)
- Structured Domain Adaptation with Online Relation Regularization for Unsupervised Person Re-ID [62.90727103061876]
Unsupervised domain adaptation (UDA) aims at adapting the model trained on a labeled source-domain dataset to an unlabeled target-domain dataset.
We propose an end-to-end structured domain adaptation framework with an online relation-consistency regularization term.
Our proposed framework is shown to achieve state-of-the-art performance on multiple UDA tasks of person re-ID.
arXiv Detail & Related papers (2020-03-14T14:45:18Z)
- When Relation Networks meet GANs: Relation GANs with Triplet Loss [110.7572918636599]
Training stability is still a lingering concern of generative adversarial networks (GANs).
In this paper, we explore a relation network architecture for the discriminator and design a triplet loss that improves generalization and stability.
Experiments on benchmark datasets show that the proposed relation discriminator and new loss provide significant improvements on various vision tasks.
arXiv Detail & Related papers (2020-02-24T11:35:28Z)
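The triplet loss named in the entry above is standard; a minimal sketch follows. How the relation discriminator forms triplets (for example, real/real pairs as positives and real/fake pairs as negatives) is assumed here rather than taken from the paper.

```python
import torch
import torch.nn.functional as F

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Pull the anchor embedding toward the positive and push it away
    from the negative by at least `margin` (hinge on the distance gap)."""
    d_pos = F.pairwise_distance(anchor, positive)
    d_neg = F.pairwise_distance(anchor, negative)
    return F.relu(d_pos - d_neg + margin).mean()
```

PyTorch also ships this as `torch.nn.TripletMarginLoss` with the same semantics.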
- Asymmetric GANs for Image-to-Image Translation [62.49892218126542]
Existing generative adversarial network (GAN) models learn the mapping from the source domain to the target domain using a cycle-consistency loss.
We propose an AsymmetricGAN model with translation and reconstruction generators of unequal sizes and a different parameter-sharing strategy.
Experiments on both supervised and unsupervised generative tasks with 8 datasets show that AsymmetricGAN achieves superior model capacity and better generation performance.
arXiv Detail & Related papers (2019-12-14T21:24:41Z)
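A small sketch of the asymmetric cycle idea from the entry above: a heavier translation generator paired with a lighter reconstruction generator, tied together by the usual L1 cycle-consistency loss. The L1 form follows CycleGAN; the capacity split is only described, not enforced, in this sketch.

```python
import torch
import torch.nn.functional as F

def asymmetric_cycle_loss(G_trans, G_recon, x, lambda_cyc=10.0):
    """Cycle consistency with unequal generators: G_trans (large) maps
    X -> Y, G_recon (lightweight) maps back Y -> X, and the input
    should survive the round trip."""
    y = G_trans(x)          # translation with the high-capacity generator
    x_rec = G_recon(y)      # reconstruction with the smaller generator
    return lambda_cyc * F.l1_loss(x_rec, x)
```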
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.