On the Role of Receptive Field in Unsupervised Sim-to-Real Image Translation
- URL: http://arxiv.org/abs/2001.09257v1
- Date: Sat, 25 Jan 2020 03:02:12 GMT
- Title: On the Role of Receptive Field in Unsupervised Sim-to-Real Image Translation
- Authors: Nikita Jaipuria, Shubh Gupta, Praveen Narayanan, Vidya N. Murali
- Abstract summary: Generative Adversarial Networks (GANs) are widely used for photo-realistic image synthesis.
When trained on unpaired data, GANs are susceptible to failure in semantic content retention as the image is translated from one domain to the other.
This paper investigates the role of the discriminator's receptive field in GANs for unsupervised image-to-image translation with mismatched data.
- Score: 4.664495510551647
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generative Adversarial Networks (GANs) are now widely used for
photo-realistic image synthesis. In applications where a simulated image needs
to be translated into a realistic image (sim-to-real), GANs trained on unpaired
data from the two domains are susceptible to failure in semantic content
retention as the image is translated from one domain to the other. This failure
mode is more pronounced in cases where the real data lacks content diversity,
resulting in a content *mismatch* between the two domains - a situation
often encountered in real-world deployment. In this paper, we investigate the
role of the discriminator's receptive field in GANs for unsupervised
image-to-image translation with mismatched data, and study its effect on
semantic content retention. Experiments with the discriminator architecture of
a state-of-the-art coupled Variational Auto-Encoder (VAE)-GAN model on
diverse, mismatched datasets show that the discriminator receptive field is
directly correlated with semantic content discrepancy of the generated image.
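The abstract's central quantity is the discriminator's receptive field. As a minimal sketch of how it can be computed, the snippet below applies the standard convolutional receptive-field recurrence to a PatchGAN-style discriminator (kernel 4, stride 2, as popularized by pix2pix/CycleGAN and used in coupled VAE-GAN models such as UNIT); the layer configurations are illustrative assumptions, not the authors' exact architecture.

```python
# Analytic receptive-field computation for a stack of conv layers.
# Standard recurrence: r_out = r_in + (k - 1) * j_in;  j_out = j_in * s.
# Layer configs below are illustrative PatchGAN-style choices (kernel 4,
# stride 2), not the exact architecture from the paper.

def receptive_field(layers):
    """layers: list of (kernel_size, stride) tuples, input-to-output order."""
    r, j = 1, 1  # receptive field and jump (cumulative stride) at the input
    for k, s in layers:
        r += (k - 1) * j
        j *= s
    return r

# A 70x70 PatchGAN: three stride-2 convs + two stride-1 convs, all kernel 4.
patchgan_70 = [(4, 2), (4, 2), (4, 2), (4, 1), (4, 1)]
print(receptive_field(patchgan_70))  # -> 70

# Deeper discriminators judge larger patches; shallower ones judge smaller.
for n_strided in range(1, 6):
    layers = [(4, 2)] * n_strided + [(4, 1), (4, 1)]
    print(n_strided, "strided layers ->", receptive_field(layers))
```

Varying the number of strided layers is the usual knob for trading off local texture discrimination against global semantic context, which is the axis the paper studies.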
Related papers
- WIDIn: Wording Image for Domain-Invariant Representation in Single-Source Domain Generalization [63.98650220772378]
We present WIDIn, Wording Images for Domain-Invariant representation, to disentangle discriminative visual representation.
We first estimate the language embedding with fine-grained alignment, which can be used to adaptively identify and then remove the domain-specific counterpart.
We show that WIDIn can be applied to both pretrained vision-language models like CLIP, and separately trained uni-modal models like MoCo and BERT.
arXiv Detail & Related papers (2024-05-28T17:46:27Z)
- StegoGAN: Leveraging Steganography for Non-Bijective Image-to-Image Translation [18.213286385769525]
CycleGAN-based methods are known to hide the mismatched information in the generated images to bypass cycle-consistency objectives (see the sketch below).
We introduce StegoGAN, a novel model that leverages steganography to prevent spurious features in generated images.
Our approach enhances the semantic consistency of the translated images without requiring additional postprocessing or supervision.
arXiv Detail & Related papers (2024-03-29T12:23:58Z)
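For context on the cycle-consistency objective that StegoGAN targets, here is a minimal sketch of the standard CycleGAN round-trip reconstruction term, assuming PyTorch; the generators G and F below are illustrative stand-ins, not StegoGAN's networks.

```python
# Minimal sketch of the CycleGAN cycle-consistency objective (L1 reconstruction
# after the round trips A -> B -> A and B -> A -> B). Generators here are
# illustrative stand-ins, not StegoGAN's actual networks.
import torch
import torch.nn as nn

l1 = nn.L1Loss()

def cycle_loss(G, F, real_a, real_b, lam=10.0):
    """G: A->B translator, F: B->A translator."""
    rec_a = F(G(real_a))  # A -> B -> A round trip
    rec_b = G(F(real_b))  # B -> A -> B round trip
    return lam * (l1(rec_a, real_a) + l1(rec_b, real_b))

# Toy usage with identity "generators" just to show the call shape.
G = F_ = nn.Identity()
a = torch.rand(1, 3, 64, 64)
b = torch.rand(1, 3, 64, 64)
print(cycle_loss(G, F_, a, b))  # 0.0 for identity maps
```

StegoGAN's observation is that a low value of this loss does not guarantee semantic fidelity: a generator can steganographically hide whatever information it needs to reconstruct the input.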
- Wavelet-based Unsupervised Label-to-Image Translation [9.339522647331334]
We propose a new unsupervised paradigm for semantic image synthesis (USIS) that makes use of a self-supervised segmentation loss and whole-image wavelet-based discrimination (sketched below).
We test our methodology on 3 challenging datasets and demonstrate its ability to bridge the performance gap between paired and unpaired models.
arXiv Detail & Related papers (2023-05-16T17:48:44Z)
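As a rough illustration of the whole-image wavelet-based discrimination mentioned above, the sketch below decomposes an image into wavelet subbands (using PyWavelets) that a discriminator could consume; the single-level Haar transform is an assumed, illustrative choice rather than the paper's exact setup.

```python
# Sketch: decompose an image into wavelet subbands that a whole-image
# discriminator could take as input. Uses PyWavelets; the single-level Haar
# decomposition here is an illustrative choice, not the paper's exact setup.
import numpy as np
import pywt

def wavelet_subbands(img):
    """img: (H, W) grayscale array -> (4, H/2, W/2) stack of subbands."""
    cA, (cH, cV, cD) = pywt.dwt2(img, "haar")
    return np.stack([cA, cH, cV, cD])  # approximation + 3 detail bands

img = np.random.rand(128, 128).astype(np.float32)
bands = wavelet_subbands(img)
print(bands.shape)  # (4, 64, 64)
```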
- Unsupervised Domain Adaptation for Semantic Segmentation using One-shot Image-to-Image Translation via Latent Representation Mixing [9.118706387430883]
We propose a new unsupervised domain adaptation method for the semantic segmentation of very high resolution images.
An image-to-image translation paradigm is proposed, based on an encoder-decoder principle where latent content representations are mixed across domains (sketched below).
Cross-city comparative experiments have shown that the proposed method outperforms state-of-the-art domain adaptation methods.
arXiv Detail & Related papers (2022-12-07T18:16:17Z)
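A minimal sketch of the latent-mixing idea referenced above: encode images from both domains, blend their latent content codes, and decode. The tiny encoder/decoder and the convex-combination mixing are illustrative assumptions, not the paper's architecture.

```python
# Sketch of cross-domain latent mixing: encode both domains, blend the latent
# content codes, and decode. Encoder/decoder below are toy stand-ins for the
# paper's actual networks; convex-combination mixing is one simple choice.
import torch
import torch.nn as nn

enc = nn.Sequential(nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU())
dec = nn.Sequential(nn.Upsample(scale_factor=2), nn.Conv2d(16, 3, 3, padding=1))

def translate_with_mixing(x_src, x_tgt, alpha=0.5):
    z_src, z_tgt = enc(x_src), enc(x_tgt)        # latent content codes
    z_mix = alpha * z_src + (1 - alpha) * z_tgt  # mix across domains
    return dec(z_mix)                            # decode to image space

out = translate_with_mixing(torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64))
print(out.shape)  # torch.Size([1, 3, 64, 64])
```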
- Marginal Contrastive Correspondence for Guided Image Generation [58.0605433671196]
Exemplar-based image translation establishes dense correspondences between a conditional input and an exemplar from two different domains.
Existing work builds the cross-domain correspondences implicitly by minimizing feature-wise distances across the two domains.
We design a Marginal Contrastive Learning Network (MCL-Net) that explores contrastive learning to learn domain-invariant features for realistic exemplar-based image translation (see the sketch below).
arXiv Detail & Related papers (2022-04-01T13:55:44Z)
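As a generic illustration of the contrastive learning that MCL-Net builds on, the sketch below implements a standard InfoNCE loss over paired cross-domain features; it is not the marginal contrastive variant the paper proposes.

```python
# Sketch of an InfoNCE-style contrastive loss over paired cross-domain
# features, in the spirit of learning domain-invariant representations.
# This is a generic formulation, not MCL-Net's exact marginal variant.
import torch
import torch.nn.functional as F

def info_nce(feat_a, feat_b, tau=0.07):
    """feat_a, feat_b: (N, D) features; row i of each is a positive pair."""
    a = F.normalize(feat_a, dim=1)
    b = F.normalize(feat_b, dim=1)
    logits = a @ b.t() / tau           # (N, N) cross-domain similarities
    targets = torch.arange(a.size(0))  # positives sit on the diagonal
    return F.cross_entropy(logits, targets)

loss = info_nce(torch.randn(8, 128), torch.randn(8, 128))
print(loss.item())
```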
- Image-to-image Translation as a Unique Source of Knowledge [91.3755431537592]
This article performs translations of labelled datasets from the optical domain to the SAR domain using different state-of-the-art I2I algorithms.
Stacking is proposed as a way of combining the knowledge learned from the different I2I translations and is evaluated against single models (sketched below).
arXiv Detail & Related papers (2021-12-03T12:12:04Z)
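A minimal sketch of stacking as an ensemble strategy, assuming scikit-learn; the base estimators and synthetic data below are placeholders, whereas the paper stacks models trained on different I2I-translated datasets.

```python
# Sketch of stacking: combine predictions from several base models with a
# meta-learner. Uses scikit-learn; the base models and data are placeholders,
# not the paper's setup of models trained on different I2I translations.
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier

# Stand-in features; in the paper's setting each base model would be trained
# on data produced by a different I2I translation algorithm.
X, y = make_classification(n_samples=300, n_features=20, random_state=0)

stack = StackingClassifier(
    estimators=[("svm", LinearSVC()), ("tree", DecisionTreeClassifier())],
    final_estimator=LogisticRegression(),
)
print(stack.fit(X, y).score(X, y))
```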
- Smoothing the Disentangled Latent Style Space for Unsupervised Image-to-Image Translation [56.55178339375146]
Image-to-Image (I2I) multi-domain translation models are usually also evaluated using the quality of their semantic results.
We propose a new training protocol based on three specific losses which help a translation network to learn a smooth and disentangled latent style space.
arXiv Detail & Related papers (2021-06-16T17:58:21Z)
- Semantically Adaptive Image-to-image Translation for Domain Adaptation of Semantic Segmentation [1.8275108630751844]
We address the problem of domain adaptation for semantic segmentation of street scenes.
Many state-of-the-art approaches focus on translating the source image while imposing that the result should be semantically consistent with the input.
We advocate that the image semantics can also be exploited to guide the translation algorithm.
arXiv Detail & Related papers (2020-09-02T16:16:50Z)
- Phase Consistent Ecological Domain Adaptation [76.75730500201536]
We focus on the task of semantic segmentation, where annotated synthetic data are plentiful but annotating real data is laborious.
The first criterion, inspired by visual psychophysics, is that the map between the two image domains be phase-preserving (sketched below).
The second criterion aims to leverage ecological statistics, or regularities in the scene which are manifest in any image of it, regardless of the characteristics of the illuminant or the imaging sensor.
arXiv Detail & Related papers (2020-04-10T06:58:03Z)
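A minimal sketch of the phase-preserving criterion's intuition, using NumPy's FFT: compare the Fourier phase of an image before and after a transformation. This is an illustrative check, not the paper's actual loss.

```python
# Sketch of the phase-preserving idea: a translation that preserves Fourier
# phase keeps scene structure. Illustrative check, not the paper's criterion.
import numpy as np

def phase_discrepancy(img_a, img_b):
    """Mean absolute difference between the Fourier phases of two images."""
    ph_a = np.angle(np.fft.fft2(img_a))
    ph_b = np.angle(np.fft.fft2(img_b))
    # Wrap differences into (-pi, pi] before averaging.
    d = np.angle(np.exp(1j * (ph_a - ph_b)))
    return np.abs(d).mean()

img = np.random.rand(64, 64)
print(phase_discrepancy(img, img))                     # 0.0: phase preserved
print(phase_discrepancy(img, np.random.rand(64, 64)))  # large: structure lost
```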
- CrDoCo: Pixel-level Domain Transfer with Cross-Domain Consistency [119.45667331836583]
Unsupervised domain adaptation algorithms aim to transfer the knowledge learned from one domain to another.
We present a novel pixel-wise adversarial domain adaptation algorithm.
arXiv Detail & Related papers (2020-01-09T19:00:35Z)