On the Role of Receptive Field in Unsupervised Sim-to-Real Image Translation
- URL: http://arxiv.org/abs/2001.09257v1
- Date: Sat, 25 Jan 2020 03:02:12 GMT
- Title: On the Role of Receptive Field in Unsupervised Sim-to-Real Image Translation
- Authors: Nikita Jaipuria, Shubh Gupta, Praveen Narayanan, Vidya N. Murali
- Abstract summary: Generative Adversarial Networks (GANs) are widely used for photo-realistic image synthesis.
When trained on unpaired data, GANs are susceptible to failure in semantic content retention as the image is translated from one domain to the other.
This paper investigates the role of the discriminator's receptive field in GANs for unsupervised image-to-image translation with mismatched data.
- Score: 4.664495510551647
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generative Adversarial Networks (GANs) are now widely used for
photo-realistic image synthesis. In applications where a simulated image needs
to be translated into a realistic image (sim-to-real), GANs trained on unpaired
data from the two domains are susceptible to failure in semantic content
retention as the image is translated from one domain to the other. This failure
mode is more pronounced in cases where the real data lacks content diversity,
resulting in a content *mismatch* between the two domains - a situation
often encountered in real-world deployment. In this paper, we investigate the
role of the discriminator's receptive field in GANs for unsupervised
image-to-image translation with mismatched data, and study its effect on
semantic content retention. Experiments with the discriminator architecture of
a state-of-the-art coupled Variational Auto-Encoder (VAE)-GAN model on
diverse, mismatched datasets show that the discriminator receptive field is
directly correlated with semantic content discrepancy of the generated image.
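The abstract's central quantity is the discriminator's receptive field. As a minimal sketch of how it can be computed, the snippet below applies the standard convolutional receptive-field recurrence to a PatchGAN-style discriminator (kernel 4, stride 2, as popularized by pix2pix/CycleGAN and used in coupled VAE-GAN models such as UNIT); the layer configurations are illustrative assumptions, not the authors' exact architecture.

```python
# Analytic receptive-field computation for a stack of conv layers.
# Standard recurrence: r_out = r_in + (k - 1) * j_in;  j_out = j_in * s.
# Layer configs below are illustrative PatchGAN-style choices (kernel 4,
# stride 2), not the exact architecture from the paper.

def receptive_field(layers):
    """layers: list of (kernel_size, stride) tuples, input-to-output order."""
    r, j = 1, 1  # receptive field and jump (cumulative stride) at the input
    for k, s in layers:
        r += (k - 1) * j
        j *= s
    return r

# A 70x70 PatchGAN: three stride-2 convs + two stride-1 convs, all kernel 4.
patchgan_70 = [(4, 2), (4, 2), (4, 2), (4, 1), (4, 1)]
print(receptive_field(patchgan_70))  # -> 70

# Deeper discriminators judge larger patches; shallower ones judge smaller.
for n_strided in range(1, 6):
    layers = [(4, 2)] * n_strided + [(4, 1), (4, 1)]
    print(n_strided, "strided layers ->", receptive_field(layers))
```

Varying the number of strided layers is the usual knob for trading off local texture discrimination against global semantic context, which is the axis the paper studies.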
Related papers
- WIDIn: Wording Image for Domain-Invariant Representation in Single-Source Domain Generalization [63.98650220772378]
We present WIDIn, Wording Images for Domain-Invariant representation, to disentangle discriminative visual representation.
We first estimate the language embedding with fine-grained alignment, which can be used to adaptively identify and then remove the domain-specific counterpart.
We show that WIDIn can be applied to both pretrained vision-language models like CLIP, and separately trained uni-modal models like MoCo and BERT.
arXiv Detail & Related papers (2024-05-28T17:46:27Z)
- StegoGAN: Leveraging Steganography for Non-Bijective Image-to-Image Translation [18.213286385769525]
CycleGAN-based methods are known to hide the mismatched information in the generated images to bypass cycle-consistency objectives (see the sketch below).
We introduce StegoGAN, a novel model that leverages steganography to prevent spurious features in generated images.
Our approach enhances the semantic consistency of the translated images without requiring additional postprocessing or supervision.
arXiv Detail & Related papers (2024-03-29T12:23:58Z)
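For context on the cycle-consistency objective that StegoGAN targets, here is a minimal sketch of the standard CycleGAN round-trip reconstruction term, assuming PyTorch; the generators G and F below are illustrative stand-ins, not StegoGAN's networks.

```python
# Minimal sketch of the CycleGAN cycle-consistency objective (L1 reconstruction
# after the round trips A -> B -> A and B -> A -> B). Generators here are
# illustrative stand-ins, not StegoGAN's actual networks.
import torch
import torch.nn as nn

l1 = nn.L1Loss()

def cycle_loss(G, F, real_a, real_b, lam=10.0):
    """G: A->B translator, F: B->A translator."""
    rec_a = F(G(real_a))  # A -> B -> A round trip
    rec_b = G(F(real_b))  # B -> A -> B round trip
    return lam * (l1(rec_a, real_a) + l1(rec_b, real_b))

# Toy usage with identity "generators" just to show the call shape.
G = F_ = nn.Identity()
a = torch.rand(1, 3, 64, 64)
b = torch.rand(1, 3, 64, 64)
print(cycle_loss(G, F_, a, b))  # 0.0 for identity maps
```

StegoGAN's observation is that a low value of this loss does not guarantee semantic fidelity: a generator can steganographically hide whatever information it needs to reconstruct the input.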
- Wavelet-based Unsupervised Label-to-Image Translation [9.339522647331334]
We propose a new unsupervised paradigm for semantic image synthesis (USIS) that makes use of a self-supervised segmentation loss and whole-image wavelet-based discrimination (sketched below).
We test our methodology on 3 challenging datasets and demonstrate its ability to bridge the performance gap between paired and unpaired models.
arXiv Detail & Related papers (2023-05-16T17:48:44Z)
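As a rough illustration of the whole-image wavelet-based discrimination mentioned above, the sketch below decomposes an image into wavelet subbands (using PyWavelets) that a discriminator could consume; the single-level Haar transform is an assumed, illustrative choice rather than the paper's exact setup.

```python
# Sketch: decompose an image into wavelet subbands that a whole-image
# discriminator could take as input. Uses PyWavelets; the single-level Haar
# decomposition here is an illustrative choice, not the paper's exact setup.
import numpy as np
import pywt

def wavelet_subbands(img):
    """img: (H, W) grayscale array -> (4, H/2, W/2) stack of subbands."""
    cA, (cH, cV, cD) = pywt.dwt2(img, "haar")
    return np.stack([cA, cH, cV, cD])  # approximation + 3 detail bands

img = np.random.rand(128, 128).astype(np.float32)
bands = wavelet_subbands(img)
print(bands.shape)  # (4, 64, 64)
```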
- Unsupervised Domain Adaptation for Semantic Segmentation using One-shot Image-to-Image Translation via Latent Representation Mixing [9.118706387430883]
We propose a new unsupervised domain adaptation method for the semantic segmentation of very high resolution images.
An image-to-image translation paradigm is proposed, based on an encoder-decoder principle where latent content representations are mixed across domains (sketched below).
Cross-city comparative experiments have shown that the proposed method outperforms state-of-the-art domain adaptation methods.
arXiv Detail & Related papers (2022-12-07T18:16:17Z)
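A minimal sketch of the latent-mixing idea referenced above: encode images from both domains, blend their latent content codes, and decode. The tiny encoder/decoder and the convex-combination mixing are illustrative assumptions, not the paper's architecture.

```python
# Sketch of cross-domain latent mixing: encode both domains, blend the latent
# content codes, and decode. Encoder/decoder below are toy stand-ins for the
# paper's actual networks; convex-combination mixing is one simple choice.
import torch
import torch.nn as nn

enc = nn.Sequential(nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU())
dec = nn.Sequential(nn.Upsample(scale_factor=2), nn.Conv2d(16, 3, 3, padding=1))

def translate_with_mixing(x_src, x_tgt, alpha=0.5):
    z_src, z_tgt = enc(x_src), enc(x_tgt)        # latent content codes
    z_mix = alpha * z_src + (1 - alpha) * z_tgt  # mix across domains
    return dec(z_mix)                            # decode to image space

out = translate_with_mixing(torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64))
print(out.shape)  # torch.Size([1, 3, 64, 64])
```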
- Marginal Contrastive Correspondence for Guided Image Generation [58.0605433671196]
Exemplar-based image translation establishes dense correspondences between a conditional input and an exemplar from two different domains.
Existing work builds the cross-domain correspondences implicitly by minimizing feature-wise distances across the two domains.
We design a Marginal Contrastive Learning Network (MCL-Net) that explores contrastive learning to learn domain-invariant features for realistic exemplar-based image translation (see the sketch below).
arXiv Detail & Related papers (2022-04-01T13:55:44Z)
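As a generic illustration of the contrastive learning that MCL-Net builds on, the sketch below implements a standard InfoNCE loss over paired cross-domain features; it is not the marginal contrastive variant the paper proposes.

```python
# Sketch of an InfoNCE-style contrastive loss over paired cross-domain
# features, in the spirit of learning domain-invariant representations.
# This is a generic formulation, not MCL-Net's exact marginal variant.
import torch
import torch.nn.functional as F

def info_nce(feat_a, feat_b, tau=0.07):
    """feat_a, feat_b: (N, D) features; row i of each is a positive pair."""
    a = F.normalize(feat_a, dim=1)
    b = F.normalize(feat_b, dim=1)
    logits = a @ b.t() / tau           # (N, N) cross-domain similarities
    targets = torch.arange(a.size(0))  # positives sit on the diagonal
    return F.cross_entropy(logits, targets)

loss = info_nce(torch.randn(8, 128), torch.randn(8, 128))
print(loss.item())
```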
- Image-to-image Translation as a Unique Source of Knowledge [91.3755431537592]
This article performs translations of labelled datasets from the optical domain to the SAR domain using different state-of-the-art I2I algorithms.
Stacking is proposed as a way of combining the knowledge learned from the different I2I translations and is evaluated against single models (sketched below).
arXiv Detail & Related papers (2021-12-03T12:12:04Z)
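A minimal sketch of stacking as an ensemble strategy, assuming scikit-learn; the base estimators and synthetic data below are placeholders, whereas the paper stacks models trained on different I2I-translated datasets.

```python
# Sketch of stacking: combine predictions from several base models with a
# meta-learner. Uses scikit-learn; the base models and data are placeholders,
# not the paper's setup of models trained on different I2I translations.
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier

# Stand-in features; in the paper's setting each base model would be trained
# on data produced by a different I2I translation algorithm.
X, y = make_classification(n_samples=300, n_features=20, random_state=0)

stack = StackingClassifier(
    estimators=[("svm", LinearSVC()), ("tree", DecisionTreeClassifier())],
    final_estimator=LogisticRegression(),
)
print(stack.fit(X, y).score(X, y))
```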
- Smoothing the Disentangled Latent Style Space for Unsupervised Image-to-Image Translation [56.55178339375146]
Image-to-Image (I2I) multi-domain translation models are usually also evaluated using the quality of their semantic results.
We propose a new training protocol based on three specific losses which help a translation network to learn a smooth and disentangled latent style space.
arXiv Detail & Related papers (2021-06-16T17:58:21Z)
- Semantically Adaptive Image-to-image Translation for Domain Adaptation of Semantic Segmentation [1.8275108630751844]
We address the problem of domain adaptation for semantic segmentation of street scenes.
Many state-of-the-art approaches focus on translating the source image while imposing that the result should be semantically consistent with the input.
We advocate that the image semantics can also be exploited to guide the translation algorithm.
arXiv Detail & Related papers (2020-09-02T16:16:50Z)
- Phase Consistent Ecological Domain Adaptation [76.75730500201536]
We focus on the task of semantic segmentation, where annotated synthetic data are plentiful but annotating real data is laborious.
The first criterion, inspired by visual psychophysics, is that the map between the two image domains be phase-preserving (sketched below).
The second criterion aims to leverage ecological statistics, or regularities in the scene which are manifest in any image of it, regardless of the characteristics of the illuminant or the imaging sensor.
arXiv Detail & Related papers (2020-04-10T06:58:03Z)
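A minimal sketch of the phase-preserving criterion's intuition, using NumPy's FFT: compare the Fourier phase of an image before and after a transformation. This is an illustrative check, not the paper's actual loss.

```python
# Sketch of the phase-preserving idea: a translation that preserves Fourier
# phase keeps scene structure. Illustrative check, not the paper's criterion.
import numpy as np

def phase_discrepancy(img_a, img_b):
    """Mean absolute difference between the Fourier phases of two images."""
    ph_a = np.angle(np.fft.fft2(img_a))
    ph_b = np.angle(np.fft.fft2(img_b))
    # Wrap differences into (-pi, pi] before averaging.
    d = np.angle(np.exp(1j * (ph_a - ph_b)))
    return np.abs(d).mean()

img = np.random.rand(64, 64)
print(phase_discrepancy(img, img))                     # 0.0: phase preserved
print(phase_discrepancy(img, np.random.rand(64, 64)))  # large: structure lost
```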
- CrDoCo: Pixel-level Domain Transfer with Cross-Domain Consistency [119.45667331836583]
Unsupervised domain adaptation algorithms aim to transfer the knowledge learned from one domain to another.
We present a novel pixel-wise adversarial domain adaptation algorithm.
arXiv Detail & Related papers (2020-01-09T19:00:35Z)