Local Class-Specific and Global Image-Level Generative Adversarial
Networks for Semantic-Guided Scene Generation
- URL: http://arxiv.org/abs/1912.12215v3
- Date: Tue, 31 Mar 2020 01:31:12 GMT
- Title: Local Class-Specific and Global Image-Level Generative Adversarial
Networks for Semantic-Guided Scene Generation
- Authors: Hao Tang, Dan Xu, Yan Yan, Philip H. S. Torr, Nicu Sebe
- Abstract summary: We consider learning the scene generation in a local context, and design a local class-specific generative network with semantic maps as guidance.
To learn more discriminative class-specific feature representations for the local generation, a novel classification module is also proposed.
Experiments on two scene image generation tasks show superior generation performance of the proposed model.
- Score: 135.4660201856059
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we address the task of semantic-guided scene generation. One
open challenge in scene generation is the difficulty of the generation of small
objects and detailed local texture, which has been widely observed in global
image-level generation methods. To tackle this issue, in this work we consider
learning the scene generation in a local context, and correspondingly design a
local class-specific generative network with semantic maps as guidance, which
separately constructs and learns sub-generators concentrating on the generation
of different classes, and is able to provide more scene details. To learn more
discriminative class-specific feature representations for the local generation,
a novel classification module is also proposed. To combine the advantages of
both the global image-level and the local class-specific generation, a joint
generation network is designed with an attention fusion module and a
dual-discriminator structure embedded. Extensive experiments on two scene image
generation tasks show superior generation performance of the proposed model.
State-of-the-art results are established by large margins on both tasks and
on challenging public benchmarks. The source code and trained models are
available at https://github.com/Ha0Tang/LGGAN.
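The linked repository holds the full details; as a rough illustration of the architecture the abstract describes, the PyTorch sketch below shows per-class local sub-generators masked by the semantic map, a global image-level branch, and pixel-wise attention fusion of the two outputs. Every layer size, module, and name here is an illustrative assumption, not the authors' implementation.

```python
# Minimal sketch (assumed, not the authors' LGGAN code) of local
# class-specific generation fused with a global branch by attention.
import torch
import torch.nn as nn

class LocalGlobalGenerator(nn.Module):
    def __init__(self, num_classes: int, feat_ch: int = 64):
        super().__init__()
        # Shared encoder over the one-hot semantic map (assumed design).
        self.encoder = nn.Sequential(
            nn.Conv2d(num_classes, feat_ch, 3, padding=1),
            nn.ReLU(inplace=True),
        )
        # Global image-level branch.
        self.global_head = nn.Conv2d(feat_ch, 3, 3, padding=1)
        # One lightweight sub-generator per semantic class (local branch).
        self.local_heads = nn.ModuleList(
            nn.Conv2d(feat_ch, 3, 3, padding=1) for _ in range(num_classes)
        )
        # Attention fusion: pixel-wise weights over [global, local].
        self.attn = nn.Conv2d(feat_ch, 2, 3, padding=1)

    def forward(self, sem_onehot: torch.Tensor) -> torch.Tensor:
        # sem_onehot: (B, num_classes, H, W), one channel per class.
        feat = self.encoder(sem_onehot)
        global_img = torch.tanh(self.global_head(feat))
        # Each sub-generator renders only its own class region; the
        # masked results are summed into a single local image.
        local_img = torch.zeros_like(global_img)
        for c, head in enumerate(self.local_heads):
            mask = sem_onehot[:, c:c + 1]            # (B, 1, H, W)
            local_img = local_img + torch.tanh(head(feat)) * mask
        # Pixel-wise attention decides each branch's contribution.
        w = torch.softmax(self.attn(feat), dim=1)    # (B, 2, H, W)
        return w[:, 0:1] * global_img + w[:, 1:2] * local_img

# Usage: generate from a toy 4-class semantic map.
g = LocalGlobalGenerator(num_classes=4)
sem = torch.zeros(1, 4, 32, 32)
sem[:, 0] = 1.0                                      # all pixels class 0
out = g(sem)                                         # (1, 3, 32, 32)
```

In the paper the fused output would additionally be scored by the dual-discriminator structure; that part is omitted in this sketch.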
Related papers
- Rethinking The Training And Evaluation of Rich-Context Layout-to-Image Generation [44.094656220043106]
A specialized area within generative modeling is layout-to-image (L2I) generation.
We introduce a novel regional cross-attention module tailored to enrich layout-to-image generation.
We propose two metrics to assess L2I performance in open-vocabulary scenarios.
arXiv Detail & Related papers (2024-09-07T14:57:03Z) - GLoD: Composing Global Contexts and Local Details in Image Generation [0.0]
Global-Local Diffusion (GLoD) is a novel framework which allows simultaneous control over the global contexts and the local details.
It assigns multiple global and local prompts to corresponding layers and composes their noises to guide a denoising process.
Our framework enables complex global-local compositions, conditioning objects in the global prompt with the local prompts while preserving other unspecified identities.
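As a rough, assumed sketch of that composition (not GLoD's released code), each local prompt's predicted noise can override the global prediction inside its region mask during a denoising step:

```python
# Sketch (assumed) of composing global and local noise predictions
# in one diffusion denoising step: local noise wins inside its mask.
import torch

def composed_noise(eps_global: torch.Tensor,
                   eps_locals: list[torch.Tensor],
                   masks: list[torch.Tensor]) -> torch.Tensor:
    """eps_*: (B, C, H, W) noise predictions; masks: (B, 1, H, W) in [0, 1]."""
    eps = eps_global.clone()
    for eps_local, mask in zip(eps_locals, masks):
        # Blend: local details inside the mask, global context outside.
        eps = mask * eps_local + (1.0 - mask) * eps
    return eps

# Usage with toy tensors standing in for a denoiser's outputs.
eps_g = torch.randn(1, 4, 8, 8)
eps_l = torch.randn(1, 4, 8, 8)
m = torch.zeros(1, 1, 8, 8)
m[..., :4, :4] = 1.0                     # local prompt's region
eps = composed_noise(eps_g, [eps_l], [m])
```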
arXiv Detail & Related papers (2024-04-23T18:39:57Z) - Instruct-Imagen: Image Generation with Multi-modal Instruction [90.04481955523514]
instruct-imagen is a model that tackles heterogeneous image generation tasks and generalizes across unseen tasks.
We introduce multi-modal instruction for image generation, a task representation articulating a range of generation intents with precision.
Human evaluation on various image generation datasets reveals that instruct-imagen matches or surpasses prior task-specific models in-domain.
arXiv Detail & Related papers (2024-01-03T19:31:58Z) - Localized Text-to-Image Generation for Free via Cross Attention Control [154.06530917754515]
We show that localized generation can be achieved by simply controlling cross attention maps during inference.
Our proposed cross attention control (CAC) provides new open-vocabulary localization abilities to standard text-to-image models.
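A small assumed sketch of the mechanism (shapes and names hypothetical, not the CAC implementation): attention from pixels outside a token's target region to that token is suppressed before the softmax.

```python
# Sketch (assumed) of localized generation via cross-attention control:
# a region mask blocks pixels outside a token's area from attending to it.
import torch

def masked_cross_attention(q, k, v, token_region):
    """
    q: (B, HW, D) image queries; k, v: (B, T, D) text keys/values.
    token_region: (B, HW, T), 1 where a pixel may attend to a token;
    unrestricted tokens get an all-ones column.
    """
    scores = q @ k.transpose(1, 2) / (q.shape[-1] ** 0.5)   # (B, HW, T)
    scores = scores.masked_fill(token_region == 0, float("-inf"))
    return torch.softmax(scores, dim=-1) @ v                # (B, HW, D)

# Usage: restrict token 1 to the first half of a 16-pixel feature map.
B, HW, T, D = 1, 16, 2, 8
q, k, v = torch.randn(B, HW, D), torch.randn(B, T, D), torch.randn(B, T, D)
region = torch.ones(B, HW, T)
region[:, 8:, 1] = 0               # token 1 visible only to pixels 0..7
out = masked_cross_attention(q, k, v, region)
```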
arXiv Detail & Related papers (2023-06-26T12:15:06Z) - Structure-Guided Image Completion with Image-level and Object-level Semantic Discriminators [97.12135238534628]
We propose a learning paradigm that consists of semantic discriminators and object-level discriminators for improving the generation of complex semantics and objects.
Specifically, the semantic discriminators leverage pretrained visual features to improve the realism of the generated visual concepts.
Our proposed scheme significantly improves the generation quality and achieves state-of-the-art results on various tasks.
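One plausible reading of "leverage pretrained visual features" (an assumption on my part, not the paper's actual architecture) is a real/fake head on top of a frozen pretrained backbone:

```python
# Sketch (assumed) of a semantic discriminator scoring realism on
# features from a frozen pretrained backbone instead of raw pixels.
import torch
import torch.nn as nn
from torchvision.models import resnet18

class SemanticDiscriminator(nn.Module):
    def __init__(self):
        super().__init__()
        backbone = resnet18(weights=None)    # load pretrained weights in practice
        # Keep everything up to global pooling, frozen.
        self.features = nn.Sequential(*list(backbone.children())[:-1])
        for p in self.features.parameters():
            p.requires_grad_(False)
        self.head = nn.Linear(512, 1)        # real/fake logit

    def forward(self, img: torch.Tensor) -> torch.Tensor:
        feat = self.features(img).flatten(1) # (B, 512)
        return self.head(feat)

# Usage on a toy batch.
d = SemanticDiscriminator()
logit = d(torch.randn(1, 3, 64, 64))         # (1, 1)
```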
arXiv Detail & Related papers (2022-12-13T01:36:56Z) - Fine-Grained Object Classification via Self-Supervised Pose Alignment [42.55938966190932]
We learn a novel graph-based object representation to reveal a global configuration of local parts for self-supervised pose alignment across classes.
We evaluate our method on three popular fine-grained object classification benchmarks, consistently achieving the state-of-the-art performance.
arXiv Detail & Related papers (2022-03-30T01:46:19Z) - Local and Global GANs with Semantic-Aware Upsampling for Image
Generation [201.39323496042527]
We consider generating images using local context.
We propose a class-specific generative network using semantic maps as guidance.
Lastly, we propose a novel semantic-aware upsampling method.
arXiv Detail & Related papers (2022-02-28T19:24:25Z) - Collaging Class-specific GANs for Semantic Image Synthesis [68.87294033259417]
We propose a new approach for high resolution semantic image synthesis.
It consists of one base image generator and multiple class-specific generators.
Experiments show that our approach can generate high quality images in high resolution.
arXiv Detail & Related papers (2021-10-08T17:46:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.