A Shared Representation for Photorealistic Driving Simulators
- URL: http://arxiv.org/abs/2112.05134v1
- Date: Thu, 9 Dec 2021 18:59:21 GMT
- Title: A Shared Representation for Photorealistic Driving Simulators
- Authors: Saeed Saadatnejad, Siyuan Li, Taylor Mordan, Alexandre Alahi
- Abstract summary: We propose to improve the quality of generated images by rethinking the discriminator architecture.
The focus is on the class of problems where images are generated given semantic inputs, such as scene segmentation maps or human body poses.
We aim to learn a shared latent representation that encodes enough information to jointly perform semantic segmentation, content reconstruction, and coarse-to-fine-grained adversarial reasoning.
- Score: 83.5985178314263
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A powerful simulator greatly reduces the need for real-world tests when
training and evaluating autonomous vehicles. Data-driven simulators flourished
with the recent advancement of conditional Generative Adversarial Networks
(cGANs), providing high-fidelity images. The main challenge is synthesizing
photorealistic images while following given constraints. In this work, we
propose to improve the quality of generated images by rethinking the
discriminator architecture. The focus is on the class of problems where images
are generated given semantic inputs, such as scene segmentation maps or human
body poses. We build on successful cGAN models to propose a new
semantically-aware discriminator that better guides the generator. We aim to
learn a shared latent representation that encodes enough information to jointly
perform semantic segmentation, content reconstruction, and coarse-to-fine-grained
adversarial reasoning. The achieved improvements are generic and simple
enough to be applied to any architecture of conditional image synthesis. We
demonstrate the strength of our method on the scene, building, and human
synthesis tasks across three different datasets. The code is available at
https://github.com/vita-epfl/SemDisc.
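To make the discriminator design concrete, the following is a minimal PyTorch sketch of a discriminator with one shared backbone and three heads (semantic segmentation, content reconstruction, and coarse/fine adversarial logits). It illustrates the idea only and is not the SemDisc code from the repository above; module names, channel sizes, and head designs are assumptions.
```python
import torch
import torch.nn as nn

class SemanticDiscriminator(nn.Module):
    """Sketch of a discriminator with a shared latent representation and three
    heads: semantic segmentation, content reconstruction, and coarse/fine
    adversarial logits. All sizes are illustrative."""

    def __init__(self, in_ch=3, num_classes=19, feat_ch=64):
        super().__init__()
        # Shared backbone: encodes the (real or generated) image.
        self.backbone = nn.Sequential(
            nn.Conv2d(in_ch, feat_ch, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(feat_ch, 2 * feat_ch, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(2 * feat_ch, 4 * feat_ch, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
        )
        # Head 1: per-pixel semantic classes (supervised by the input segmentation map).
        self.seg_head = nn.Conv2d(4 * feat_ch, num_classes, 1)
        # Head 2: reconstruct the image content from the shared features.
        self.rec_head = nn.Sequential(
            nn.Upsample(scale_factor=8, mode="bilinear", align_corners=False),
            nn.Conv2d(4 * feat_ch, in_ch, 3, padding=1), nn.Tanh(),
        )
        # Head 3: real/fake logits at a fine (patch) and a coarse (global) scale.
        self.adv_fine = nn.Conv2d(4 * feat_ch, 1, 1)
        self.adv_coarse = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(4 * feat_ch, 1))

    def forward(self, image):
        h = self.backbone(image)              # shared latent representation
        return {
            "seg": self.seg_head(h),          # (B, num_classes, H/8, W/8)
            "rec": self.rec_head(h),          # (B, in_ch, H, W)
            "adv_fine": self.adv_fine(h),     # patch-level real/fake map
            "adv_coarse": self.adv_coarse(h), # image-level real/fake score
        }

# Smoke test with a random batch.
if __name__ == "__main__":
    d = SemanticDiscriminator()
    out = d(torch.randn(2, 3, 128, 128))
    print({k: tuple(v.shape) for k, v in out.items()})
```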
Related papers
- Zero-Shot Detection of AI-Generated Images [54.01282123570917]
We propose a zero-shot entropy-based detector (ZED) to detect AI-generated images.
Inspired by recent works on machine-generated text detection, our idea is to measure how surprising the image under analysis is to a model of real images.
ZED achieves an average improvement of more than 3% over the SoTA in terms of accuracy.
arXiv Detail & Related papers (2024-09-24T08:46:13Z)
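A minimal sketch of the surprisal idea behind ZED, under stated assumptions: `real_image_model` is a hypothetical callable standing in for a model trained on real images, and both the threshold value and its direction are illustrative, not taken from the paper.
```python
import torch

def surprisal_score(image_u8, real_image_model):
    """Average per-pixel surprisal (negative log2-likelihood) of a grayscale
    uint8 image batch of shape (B, 1, H, W) under a model of real images.
    `real_image_model` is a hypothetical callable returning a probability for
    each of the 256 intensity values, shape (B, 256, H, W)."""
    probs = real_image_model(image_u8)                 # (B, 256, H, W)
    nll = -torch.log2(probs.clamp_min(1e-12))          # bits for every possible value
    picked = nll.gather(1, image_u8.long())            # bits of the observed pixel values
    return picked.mean(dim=(1, 2, 3))                  # mean bits/pixel per image

def looks_generated(image_u8, real_image_model, threshold=4.0):
    # Assumed heuristic: generated images tend to be less "surprising" under a
    # real-image model; the threshold here is purely illustrative.
    return surprisal_score(image_u8, real_image_model) < threshold
```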
- Unlocking Pre-trained Image Backbones for Semantic Image Synthesis [29.688029979801577]
We propose a new class of GAN discriminators for semantic image synthesis that enables the generation of highly realistic images.
Our model, which we dub DP-SIMS, achieves state-of-the-art results in terms of image quality and consistency with the input label maps on ADE-20K, COCO-Stuff, and Cityscapes.
arXiv Detail & Related papers (2023-12-20T09:39:19Z)
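The point of this paper is to reuse a pre-trained image backbone inside the discriminator. Below is a rough sketch of that pattern with a frozen torchvision ResNet-50 and a small trainable head; the per-location class-plus-fake output is a common formulation for semantic image synthesis discriminators and is assumed here, not necessarily the DP-SIMS architecture.
```python
import torch
import torch.nn as nn
from torchvision import models

class BackboneDiscriminator(nn.Module):
    """Sketch: frozen pre-trained encoder + small trainable head that predicts,
    per spatial location, one of N semantic classes or an extra 'fake' class."""

    def __init__(self, num_classes=19):
        super().__init__()
        resnet = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
        # Keep everything up to the last conv stage; drop avgpool and fc.
        self.encoder = nn.Sequential(*list(resnet.children())[:-2])
        for p in self.encoder.parameters():
            p.requires_grad_(False)           # backbone stays frozen
        self.head = nn.Conv2d(2048, num_classes + 1, kernel_size=1)  # +1 = "fake"

    def forward(self, image):
        with torch.no_grad():
            feats = self.encoder(image)       # (B, 2048, H/32, W/32)
        return self.head(feats)               # per-location class / fake logits
```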
- Joint one-sided synthetic unpaired image translation and segmentation for colorectal cancer prevention [16.356954231068077]
We produce realistic synthetic images using a combination of 3D technologies and generative adversarial networks.
We propose CUT-seg, a joint training scheme in which a segmentation model and a generative model are trained together to produce realistic images.
As a part of this study we release Synth-Colon, an entirely synthetic dataset that includes 20,000 realistic colon images.
arXiv Detail & Related papers (2023-07-20T22:09:04Z)
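A simplified sketch of the joint training idea: the generator translates a synthetic colon image toward the real domain while a segmentation model must still recover the known synthetic mask from the translated image. All modules and losses are placeholders; CUT-seg's contrastive translation term is not reproduced.
```python
import torch
import torch.nn.functional as F

def joint_step(generator, segmenter, discriminator, synth_img, synth_mask):
    """One simplified joint step: the translated image has to fool the
    discriminator AND stay segmentable with the original synthetic mask
    (synth_mask holds integer class indices)."""
    translated = generator(synth_img)                    # synthetic -> realistic style
    adv_logits = discriminator(translated)
    adv_loss = F.binary_cross_entropy_with_logits(
        adv_logits, torch.ones_like(adv_logits))         # generator wants "real"
    seg_logits = segmenter(translated)
    seg_loss = F.cross_entropy(seg_logits, synth_mask)   # polyp layout must survive
    return adv_loss + seg_loss
```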
- Towards Pragmatic Semantic Image Synthesis for Urban Scenes [4.36080478413575]
We present a new task: given a dataset with synthetic images and labels and a dataset with unlabeled real images, our goal is to learn a model that can generate images with the content of the input mask and the appearance of real images.
We leverage the synthetic image as a guide to the content of the generated image by penalizing the difference between their high-level features on a patch level.
In contrast to previous works, which employ a single discriminator that overfits the target-domain semantic distribution, we employ one discriminator for the whole image and multiscale discriminators on image patches.
arXiv Detail & Related papers (2023-05-16T18:01:12Z)
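A sketch of the patch-level content constraint described above: high-level features of the generated image are matched to those of the synthetic guide patch by patch. `feature_extractor` and the patch size are assumptions for illustration, not the paper's implementation.
```python
import torch.nn.functional as F

def patch_feature_loss(feature_extractor, generated, synth_guide, patch=8):
    """Penalize the difference between high-level features of the generated
    image and its synthetic guide, averaged over local patches rather than
    the full map (a sketch of the content-matching idea only)."""
    f_gen = feature_extractor(generated)      # (B, C, H, W) high-level features
    f_ref = feature_extractor(synth_guide)
    # Average features inside each patch, then compare patch descriptors.
    g = F.avg_pool2d(f_gen, patch)
    r = F.avg_pool2d(f_ref, patch)
    return F.l1_loss(g, r)
```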
- InvGAN: Invertible GANs [88.58338626299837]
InvGAN, short for Invertible GAN, successfully embeds real images into the latent space of a high-quality generative model.
This allows us to perform image inpainting, merging, and online data augmentation.
arXiv Detail & Related papers (2021-12-08T21:39:00Z)
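The inversion idea can be caricatured as an encoder trained so that the generator reproduces its input; only a reconstruction term is shown below, and both modules are placeholders (InvGAN additionally uses adversarial objectives that are omitted here).
```python
import torch.nn.functional as F

def inversion_loss(encoder, generator, real_img):
    """Embed a real image into the generator's latent space and require the
    generator to reproduce it (reconstruction term only)."""
    z = encoder(real_img)     # real image -> latent code
    recon = generator(z)      # latent code -> image
    return F.l1_loss(recon, real_img)
```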
- StackGAN: Facial Image Generation Optimizations [0.0]
Current state-of-the-art photorealistic generators are computationally expensive, involve unstable training processes, and have real and synthetic distributions that are dissimilar in higher-dimensional spaces.
We propose a variant of the StackGAN architecture, which incorporates conditional generators to construct an image in many stages.
Our model is trained on the CelebA facial image dataset and achieves a Fréchet Inception Distance (FID) of 73 for edge images and 59 for grayscale images generated from the synthetic edge images.
arXiv Detail & Related papers (2021-08-30T15:04:47Z)
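A minimal sketch of stage-wise conditional generation as summarized above: a first generator drafts a coarse image from the condition (e.g. an edge map) and a second generator refines it. The interfaces are assumptions, not the paper's implementation.
```python
import torch
import torch.nn.functional as F

def staged_generate(stage1, stage2, condition):
    """Two-stage conditional generation sketch: stage 1 maps the condition
    (e.g. an edge image) to a coarse face, stage 2 refines it. Both stages
    are placeholder modules with assumed interfaces."""
    coarse = stage1(condition)                                  # low-resolution draft
    coarse_up = F.interpolate(coarse, size=condition.shape[-2:],
                              mode="bilinear", align_corners=False)
    refined = stage2(torch.cat([coarse_up, condition], dim=1))  # condition-aware refinement
    return coarse, refined
```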
- Semantic Segmentation with Generative Models: Semi-Supervised Learning and Strong Out-of-Domain Generalization [112.68171734288237]
We propose a novel framework for discriminative pixel-level tasks using a generative model of both images and labels.
We learn a generative adversarial network that captures the joint image-label distribution and is trained efficiently using a large set of unlabeled images.
We demonstrate strong in-domain performance compared to several baselines, and are the first to showcase extreme out-of-domain generalization.
arXiv Detail & Related papers (2021-04-12T21:41:25Z)
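The joint image-label distribution mentioned above can be pictured as a generator with two output branches sharing one latent code, so every sample comes with an aligned label map. The sketch below is illustrative only; layer sizes and names are assumptions.
```python
import torch
import torch.nn as nn

class JointImageLabelGenerator(nn.Module):
    """Sketch: one latent code is decoded into both an image and an aligned
    label map, so sampling the GAN yields (image, segmentation) pairs that
    can supervise a segmentation network."""

    def __init__(self, z_dim=128, num_classes=19, ch=64, out_res=32):
        super().__init__()
        self.fc = nn.Linear(z_dim, ch * out_res * out_res)
        self.ch, self.res = ch, out_res
        self.image_branch = nn.Conv2d(ch, 3, 3, padding=1)
        self.label_branch = nn.Conv2d(ch, num_classes, 3, padding=1)

    def forward(self, z):
        h = self.fc(z).view(-1, self.ch, self.res, self.res)
        return torch.tanh(self.image_branch(h)), self.label_branch(h)
```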
- You Only Need Adversarial Supervision for Semantic Image Synthesis [84.83711654797342]
We propose a novel, simplified GAN model, which needs only adversarial supervision to achieve high quality results.
We show that images synthesized by our model are more diverse and follow the color and texture of real images more closely.
arXiv Detail & Related papers (2020-12-08T23:00:48Z)
- Intrinsic Autoencoders for Joint Neural Rendering and Intrinsic Image Decomposition [67.9464567157846]
We propose an autoencoder for joint generation of realistic images from synthetic 3D models while simultaneously decomposing real images into their intrinsic shape and appearance properties.
Our experiments confirm that a joint treatment of rendering and decomposition is indeed beneficial and that our approach outperforms state-of-the-art image-to-image translation baselines both qualitatively and quantitatively.
arXiv Detail & Related papers (2020-06-29T12:53:58Z)
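A simplified sketch of the joint rendering/decomposition objective: the decoder emits albedo and shading maps whose product must reconstruct the input image. Module contents and the loss are placeholders, not the paper's network.
```python
import torch.nn.functional as F

def intrinsic_autoencoder_step(encoder, albedo_dec, shading_dec, image):
    """Decompose an image into intrinsic albedo and shading and require that
    their product reconstructs the input (a simplified stand-in for the
    paper's joint rendering + decomposition training)."""
    h = encoder(image)
    albedo = albedo_dec(h)       # reflectance / appearance
    shading = shading_dec(h)     # illumination / shape-dependent shading
    recon = albedo * shading     # re-render by composing the intrinsics
    return F.l1_loss(recon, image)
```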