Unifying conditional and unconditional semantic image synthesis with OCO-GAN
- URL: http://arxiv.org/abs/2211.14105v1
- Date: Fri, 25 Nov 2022 13:43:21 GMT
- Title: Unifying conditional and unconditional semantic image synthesis with OCO-GAN
- Authors: Marlène Careil, Stéphane Lathuilière, Camille Couprie, Jakob Verbeek
- Abstract summary: We propose OCO-GAN, for Optionally COnditioned GAN, which addresses both tasks in a unified manner.
Trained adversarially in an end-to-end approach with a shared discriminator, we are able to leverage the synergy between both tasks.
Our results are competitive with or better than state-of-the-art specialised unconditional and conditional models.
- Score: 29.77186837186815
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Generative image models have been extensively studied in recent years. In the
unconditional setting, they model the marginal distribution from unlabelled
images. To allow for more control, image synthesis can be conditioned on
semantic segmentation maps that indicate to the generator the position of objects
in the image. While these two tasks are intimately related, they are generally
studied in isolation. We propose OCO-GAN, for Optionally COnditioned GAN, which
addresses both tasks in a unified manner, with a shared image synthesis network
that can be conditioned either on semantic maps or directly on latents. Trained
adversarially in an end-to-end approach with a shared discriminator, we are
able to leverage the synergy between both tasks. We experiment with Cityscapes,
COCO-Stuff, ADE20K datasets in a limited data, semi-supervised and full data
regime and obtain excellent performance, improving over existing hybrid models
that can generate both with and without conditioning in all settings. Moreover,
our results are competitive with or better than state-of-the-art specialised
unconditional and conditional models.
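To make the optional conditioning concrete, here is a minimal PyTorch sketch of a generator with a shared synthesis trunk that accepts either a latent alone (unconditional path) or a latent together with a semantic map (conditional path). All module names, layer sizes, and the additive fusion scheme are illustrative assumptions, not the authors' architecture.
```python
# Minimal sketch of an optionally conditioned generator in the spirit of
# OCO-GAN. Module names, sizes, and the fusion scheme are assumptions.
import torch
import torch.nn as nn

class OptionallyConditionedGenerator(nn.Module):
    def __init__(self, z_dim=128, n_classes=35, base_ch=64):
        super().__init__()
        self.base_ch = base_ch
        # Map the latent to a coarse 4x4 feature grid.
        self.from_latent = nn.Linear(z_dim, base_ch * 4 * 4)
        # Encode a one-hot semantic map to the same coarse grid when given.
        self.from_semantics = nn.Sequential(
            nn.Conv2d(n_classes, base_ch, 3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(4),
        )
        # Shared synthesis trunk used by both paths
        # (4x4 -> 16x16 here, kept tiny for brevity).
        self.trunk = nn.Sequential(
            nn.ConvTranspose2d(base_ch, base_ch, 4, stride=2, padding=1),
            nn.ReLU(),
            nn.ConvTranspose2d(base_ch, 3, 4, stride=2, padding=1),
            nn.Tanh(),
        )

    def forward(self, z, seg_map=None):
        h = self.from_latent(z).view(-1, self.base_ch, 4, 4)
        if seg_map is not None:
            # Conditional path: fuse semantic features additively.
            h = h + self.from_semantics(seg_map)
        return self.trunk(h)

g = OptionallyConditionedGenerator()
z = torch.randn(2, 128)
seg = torch.randn(2, 35, 64, 64)  # stand-in for a one-hot label map
unconditional = g(z)              # latent-only synthesis
conditional = g(z, seg_map=seg)   # semantics-guided synthesis
```
In the setup described in the abstract, both kinds of samples would be scored by a single shared discriminator trained end-to-end, which is where the synergy between the two tasks is leveraged.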
Related papers
- Comprehensive Generative Replay for Task-Incremental Segmentation with Concurrent Appearance and Semantic Forgetting [49.87694319431288]
Generalist segmentation models are increasingly favored for diverse tasks involving various objects from different image sources.
We propose a Comprehensive Generative Replay (CGR) framework that restores appearance and semantic knowledge by synthesizing image-mask pairs.
Experiments on incremental tasks (cardiac, fundus and prostate segmentation) show its clear advantage for alleviating concurrent appearance and semantic forgetting.
arXiv Detail & Related papers (2024-06-28T10:05:58Z)
- Unlocking Pre-trained Image Backbones for Semantic Image Synthesis [29.688029979801577]
We propose a new class of GAN discriminators for semantic image synthesis that enables generating highly realistic images.
Our model, which we dub DP-SIMS, achieves state-of-the-art results in terms of image quality and consistency with the input label maps on ADE-20K, COCO-Stuff, and Cityscapes.
arXiv Detail & Related papers (2023-12-20T09:39:19Z)
- Controllable Image Generation via Collage Representations [31.456445433105415]
"Mixing and matching scenes" (M&Ms) is an approach that consists of an adversarially trained generative image model conditioned on appearance features and spatial positions of the different elements in a collage.
We show that M&Ms outperforms baselines in terms of fine-grained scene controllability while being very competitive in terms of image quality and sample diversity.
arXiv Detail & Related papers (2023-04-26T17:58:39Z)
- Composer: Creative and Controllable Image Synthesis with Composable Conditions [57.78533372393828]
Recent large-scale generative models learned on big data are capable of synthesizing incredible images yet suffer from limited controllability.
This work offers a new generation paradigm that allows flexible control of the output image, such as spatial layout and palette, while maintaining the synthesis quality and model creativity.
arXiv Detail & Related papers (2023-02-20T05:48:41Z)
- Semantic Image Synthesis via Diffusion Models [159.4285444680301]
Denoising Diffusion Probabilistic Models (DDPMs) have achieved remarkable success in various image generation tasks.
Recent work on semantic image synthesis mainly follows the de facto standard of Generative Adversarial Nets (GANs).
arXiv Detail & Related papers (2022-06-30T18:31:51Z)
- Co-Training for Unsupervised Domain Adaptation of Semantic Segmentation Models [0.0]
We propose a new co-training process for synth-to-real UDA of semantic segmentation models.
Our co-training shows improvements of 15-20 percentage points in mIoU over baselines.
arXiv Detail & Related papers (2022-05-31T13:30:36Z)
- Instance-Conditioned GAN [26.27527697877534]
Generative Adversarial Networks (GANs) can generate near-photorealistic images in narrow domains such as human faces.
We take inspiration from kernel density estimation techniques and introduce a non-parametric approach to modeling distributions of complex datasets (a minimal sketch of this instance conditioning follows this list).
arXiv Detail & Related papers (2021-09-10T19:08:45Z)
- Gradient-Induced Co-Saliency Detection [81.54194063218216]
Co-saliency detection (Co-SOD) aims to segment the common salient foreground in a group of relevant images.
In this paper, inspired by human behavior, we propose a gradient-induced co-saliency detection method.
arXiv Detail & Related papers (2020-04-28T08:40:55Z)
- Example-Guided Image Synthesis across Arbitrary Scenes using Masked Spatial-Channel Attention and Self-Supervision [83.33283892171562]
Example-guided image synthesis aims to synthesize an image from a semantic label map and an exemplar image.
In this paper, we tackle a more challenging and general task, where the exemplar is an arbitrary scene image that is semantically different from the given label map.
We propose an end-to-end network for joint global and local feature alignment and synthesis.
arXiv Detail & Related papers (2020-04-18T18:17:40Z)
- Multimodal Image Synthesis with Conditional Implicit Maximum Likelihood Estimation [54.17177006826262]
We develop a new generic conditional image synthesis method based on Implicit Maximum Likelihood Estimation (IMLE); a minimal sketch of the IMLE objective also follows this list.
We demonstrate improved multimodal image synthesis performance on two tasks, single image super-resolution and image synthesis from scene layouts.
arXiv Detail & Related papers (2020-04-07T03:06:55Z)
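As referenced above, here is a minimal sketch of instance conditioning in the spirit of Instance-Conditioned GAN: generation is conditioned on the embedding of a real instance so the model covers that instance's neighbourhood, much like a kernel placed on a data point in kernel density estimation. The embedding source, network sizes, and neighbourhood semantics are illustrative assumptions, not the paper's implementation.
```python
# Minimal sketch of instance conditioning in the spirit of IC-GAN.
# The embedding source and all sizes are illustrative assumptions.
import torch
import torch.nn as nn

z_dim, feat_dim, out_dim = 16, 32, 64

# Generator maps [latent, instance embedding] -> sample.
generator = nn.Sequential(
    nn.Linear(z_dim + feat_dim, 128), nn.ReLU(),
    nn.Linear(128, out_dim),
)

def sample_around_instance(instance_feat, n=4):
    # Draw n samples from the learned neighbourhood of one instance,
    # analogous to sampling from a kernel centred on that data point.
    z = torch.randn(n, z_dim)
    cond = instance_feat.unsqueeze(0).expand(n, -1)
    return generator(torch.cat([z, cond], dim=1))

feat = torch.randn(feat_dim)  # stand-in for a pretrained-encoder embedding
samples = sample_around_instance(feat)  # 4 samples near this instance
```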
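And here is a minimal sketch of the IMLE objective behind the multimodal synthesis paper above: for each conditioning input, draw several latent samples, keep the generated output nearest to the ground truth, and minimize that nearest distance, so every data point is covered by some sample. The toy generator and the squared-L2 distance are illustrative assumptions.
```python
import torch
import torch.nn as nn

class ToyCondGenerator(nn.Module):
    """Tiny stand-in generator mapping (latent, condition) -> output."""
    def __init__(self, z_dim=16, c_dim=8, out_dim=32):
        super().__init__()
        self.z_dim = z_dim
        self.net = nn.Sequential(
            nn.Linear(z_dim + c_dim, 64), nn.ReLU(),
            nn.Linear(64, out_dim),
        )

    def forward(self, z, c):
        return self.net(torch.cat([z, c], dim=1))

def imle_loss(gen, conds, targets, n_samples=8):
    # IMLE: for each (condition, target) pair, draw several latents,
    # keep the sample nearest to the target, and average those nearest
    # distances over the batch.
    losses = []
    for c, y in zip(conds, targets):
        z = torch.randn(n_samples, gen.z_dim)
        fakes = gen(z, c.unsqueeze(0).expand(n_samples, -1))
        dists = ((fakes - y) ** 2).mean(dim=1)  # squared L2 per sample
        losses.append(dists.min())
    return torch.stack(losses).mean()

gen = ToyCondGenerator()
conds, targets = torch.randn(4, 8), torch.randn(4, 32)
loss = imle_loss(gen, conds, targets)
loss.backward()  # unlike a GAN, no discriminator is involved
```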
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.