Semantic Palette: Guiding Scene Generation with Class Proportions
- URL: http://arxiv.org/abs/2106.01629v1
- Date: Thu, 3 Jun 2021 07:04:00 GMT
- Title: Semantic Palette: Guiding Scene Generation with Class Proportions
- Authors: Guillaume Le Moing and Tuan-Hung Vu and Himalaya Jain and Patrick Pérez and Matthieu Cord
- Abstract summary: We introduce a conditional framework with novel architecture designs and learning objectives, which effectively accommodates class proportions to guide the scene generation process.
Thanks to the semantic control, we can produce layouts close to the real distribution, helping enhance the whole scene generation process.
We demonstrate the merit of our approach for data augmentation: semantic segmenters trained on real layout-image pairs along with additional pairs generated by our approach outperform models trained on real pairs only.
- Score: 34.746963256847145
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite the recent progress of generative adversarial networks (GANs) at
synthesizing photo-realistic images, producing complex urban scenes remains a
challenging problem. Previous works break down scene generation into two
consecutive phases: unconditional semantic layout synthesis and image synthesis
conditioned on layouts. In this work, we propose to condition layout generation
as well for higher semantic control: given a vector of class proportions, we
generate layouts with matching composition. To this end, we introduce a
conditional framework with novel architecture designs and learning objectives,
which effectively accommodates class proportions to guide the scene generation
process. The proposed architecture also allows partial layout editing with
interesting applications. Thanks to the semantic control, we can produce
layouts close to the real distribution, helping enhance the whole scene
generation process. On different metrics and urban scene benchmarks, our models
outperform existing baselines. Moreover, we demonstrate the merit of our
approach for data augmentation: semantic segmenters trained on real
layout-image pairs along with additional ones generated by our approach
outperform models only trained on real pairs.
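The core idea, conditioning the layout generator on a class-proportion vector, can be illustrated with a short PyTorch sketch. This is a minimal illustration rather than the authors' actual architecture or learning objectives: the tensor shapes, the softmax relaxation of the layout, and the L1 composition penalty are assumptions made for the example.
```python
# Minimal sketch (not the authors' released code): turn a semantic layout into a
# class-proportion vector and penalize a generated layout whose composition strays
# from a requested target vector. Shapes, the softmax relaxation, and the L1
# penalty below are illustrative assumptions.
import torch
import torch.nn.functional as F


def class_proportions(layout_logits: torch.Tensor) -> torch.Tensor:
    """Soft class-proportion vector of a layout.

    layout_logits: (B, C, H, W) per-pixel class scores from a layout generator.
    Returns a (B, C) tensor whose rows sum to 1 over the C semantic classes.
    """
    probs = F.softmax(layout_logits, dim=1)   # per-pixel class distribution
    return probs.mean(dim=(2, 3))             # average over all spatial positions


def composition_loss(layout_logits: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """L1 mismatch between the layout's composition and the requested proportions."""
    return (class_proportions(layout_logits) - target).abs().sum(dim=1).mean()


if __name__ == "__main__":
    B, C, H, W = 2, 19, 64, 128                       # e.g. the 19 Cityscapes classes
    fake_layout = torch.randn(B, C, H, W, requires_grad=True)
    target = torch.full((B, C), 1.0 / C)              # request a uniform composition
    loss = composition_loss(fake_layout, target)
    loss.backward()                                   # gradient reaches the generator
    print(f"composition loss: {loss.item():.4f}")
```
In a full GAN pipeline such a penalty would be combined with adversarial and other losses, and the same proportion vector would also be fed to the generator as a conditioning input.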
Related papers
- SAMPLING: Scene-adaptive Hierarchical Multiplane Images Representation for Novel View Synthesis from a Single Image [60.52991173059486]
We introduce SAMPLING, a Scene-adaptive Hierarchical Multiplane Images Representation for Novel View Synthesis from a Single Image.
Our method demonstrates considerable performance gains in large-scale unbounded outdoor scenes using a single image on the KITTI dataset.
arXiv Detail & Related papers (2023-09-12T15:33:09Z)
- LayoutLLM-T2I: Eliciting Layout Guidance from LLM for Text-to-Image Generation [121.45667242282721]
We propose a coarse-to-fine paradigm to achieve layout planning and image generation.
Our proposed method outperforms the state-of-the-art models in terms of photorealistic layout and image generation.
arXiv Detail & Related papers (2023-08-09T17:45:04Z)
- Composer: Creative and Controllable Image Synthesis with Composable Conditions [57.78533372393828]
Recent large-scale generative models learned on big data are capable of synthesizing incredible images yet suffer from limited controllability.
This work offers a new generation paradigm that allows flexible control of the output image, such as spatial layout and palette, while maintaining the synthesis quality and model creativity.
arXiv Detail & Related papers (2023-02-20T05:48:41Z)
- SceneComposer: Any-Level Semantic Image Synthesis [80.55876413285587]
We propose a new framework for conditional image synthesis from semantic layouts of any precision levels.
The framework naturally reduces to text-to-image (T2I) at the lowest level with no shape information, and it becomes segmentation-to-image (S2I) at the highest level.
We introduce several novel techniques to address the challenges coming with this new setup.
arXiv Detail & Related papers (2022-11-21T18:59:05Z)
- Interactive Image Synthesis with Panoptic Layout Generation [14.1026819862002]
We propose Panoptic Layout Generative Adversarial Networks (PLGAN) to address this challenge.
PLGAN employs panoptic theory, which distinguishes between "stuff" categories with amorphous boundaries and "thing" categories with well-defined shapes.
We experimentally compare our PLGAN with state-of-the-art layout-based models on the COCO-Stuff, Visual Genome, and Landscape datasets.
arXiv Detail & Related papers (2022-03-04T02:45:27Z)
- Generating Novel Scene Compositions from Single Images and Videos [21.92417902229955]
We introduce SIV-GAN, an unconditional generative model that can generate new scene compositions from a single training image or a single video clip.
Compared to previous single image GANs, our model generates more diverse, higher quality images, while not being restricted to a single image setting.
arXiv Detail & Related papers (2021-03-24T17:59:07Z)
- End-to-End Optimization of Scene Layout [56.80294778746068]
We propose an end-to-end variational generative model for scene layout synthesis conditioned on scene graphs.
We use scene graphs as an abstract but general representation to guide the synthesis of diverse scene layouts.
arXiv Detail & Related papers (2020-07-23T01:35:55Z)
- Example-Guided Image Synthesis across Arbitrary Scenes using Masked Spatial-Channel Attention and Self-Supervision [83.33283892171562]
Example-guided image synthesis has recently been attempted to synthesize an image from a semantic label map and an exemplar image.
In this paper, we tackle a more challenging and general task, where the exemplar is an arbitrary scene image that is semantically different from the given label map.
We propose an end-to-end network for joint global and local feature alignment and synthesis.
arXiv Detail & Related papers (2020-04-18T18:17:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.