Attribute-Guided Image Generation from Layout
- URL: http://arxiv.org/abs/2008.11932v1
- Date: Thu, 27 Aug 2020 06:22:14 GMT
- Title: Attribute-Guided Image Generation from Layout
- Authors: Ke Ma, Bo Zhao, Leonid Sigal
- Abstract summary: We propose a new image generation method that enables instance-level attribute control.
Experiments on the Visual Genome dataset demonstrate our model's capacity to control object-level attributes in generated images.
The images generated by our model have higher resolution, object classification accuracy, and consistency than the previous state of the art.
- Score: 38.817023543020134
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent approaches have achieved great success in image generation from
structured inputs, e.g., semantic segmentation, scene graph or layout. Although
these methods allow specification of objects and their locations at the
image level, they lack the fidelity and semantic control to specify the visual
appearance of these objects at the instance level. To address this limitation,
we propose a new image generation method that enables instance-level attribute
control. Specifically, the input to our attribute-guided generative model is a
tuple that contains: (1) object bounding boxes, (2) object categories and (3)
an (optional) set of attributes for each object. The output is a generated
image where the requested objects are in the desired locations and have
prescribed attributes. Several losses work collaboratively to encourage
accurate, consistent, and diverse image generation. Experiments on the Visual
Genome dataset demonstrate our model's capacity to control object-level
attributes in generated images, and validate the plausibility of a disentangled
object-attribute representation for the image-generation-from-layout task. The
images generated by our model also have higher resolution, object
classification accuracy, and consistency than the previous state of the art.
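To make the input specification concrete, here is a minimal sketch of the (bounding box, category, optional attributes) tuple described in the abstract; the names ObjectSpec and LayoutSpec are illustrative, not taken from the paper's code.

```python
# Hedged sketch of the model's input tuple: per-object bounding boxes,
# categories, and an optional attribute set. Class names are hypothetical.
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class ObjectSpec:
    box: Tuple[float, float, float, float]   # (x, y, w, h), normalized to [0, 1]
    category: str                             # e.g., "sky", "car", "person"
    attributes: List[str] = field(default_factory=list)  # optional, e.g., ["red"]

@dataclass
class LayoutSpec:
    objects: List[ObjectSpec]

layout = LayoutSpec(objects=[
    ObjectSpec(box=(0.0, 0.0, 1.0, 0.4), category="sky", attributes=["cloudy"]),
    ObjectSpec(box=(0.1, 0.5, 0.3, 0.4), category="car", attributes=["red", "shiny"]),
    ObjectSpec(box=(0.6, 0.45, 0.25, 0.5), category="person"),  # attributes omitted
])
```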
Related papers
- SINGAPO: Single Image Controlled Generation of Articulated Parts in Objects [20.978091381109294]
We propose a method to generate articulated objects from a single image.
Our method generates an articulated object that is visually consistent with the input image.
Our experiments show that our method outperforms the state of the art in articulated object creation.
arXiv Detail & Related papers (2024-10-21T20:41:32Z)
- Resolving Multi-Condition Confusion for Finetuning-Free Personalized Image Generation [10.416673784744281]
We propose a weighted-merge method that merges multiple reference image features into their corresponding objects (a sketch of the idea follows this entry).
Our method outperforms prior state-of-the-art methods on the Concept101 and DreamBooth datasets for multi-object personalized image generation.
arXiv Detail & Related papers (2024-09-26T15:04:13Z)
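A minimal sketch of what such a weighted merge could look like, assuming a softmax over per-reference relevance scores; this illustrates the general idea, not the paper's exact formulation.

```python
# Hypothetical weighted merge: combine several reference-image feature
# vectors into one per-object feature via a softmax-weighted average.
import torch

def weighted_merge(ref_feats: torch.Tensor, scores: torch.Tensor) -> torch.Tensor:
    """ref_feats: (num_refs, dim); scores: (num_refs,) relevance of each
    reference to the target object. Returns a single (dim,) feature."""
    weights = torch.softmax(scores, dim=0)          # convex combination weights
    return (weights.unsqueeze(1) * ref_feats).sum(dim=0)

feats = torch.randn(3, 768)                         # three reference features
scores = torch.tensor([2.0, 0.5, -1.0])             # e.g., similarity to the object
merged = weighted_merge(feats, scores)              # (768,)
```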
- Diffusion Self-Guidance for Controllable Image Generation [106.59989386924136]
Self-guidance provides greater control over generated images by guiding the internal representations of diffusion models (a toy sketch follows this entry).
We show how a simple set of properties can be composed to perform challenging image manipulations.
We also show that self-guidance can be used to edit real images.
arXiv Detail & Related papers (2023-06-01T17:59:56Z)
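A toy sketch of the self-guidance idea: compute a scalar property from an internal representation (here, the horizontal center of mass of an attention map) and steer the denoising direction with its gradient. TinyDenoiser, the property, and the guidance scale are all stand-ins, not the paper's model or code.

```python
# Toy self-guidance: the gradient of a property of internal activations
# is added to the noise estimate to steer sampling. Purely illustrative.
import torch

class TinyDenoiser(torch.nn.Module):
    """Stand-in for a diffusion UNet that also exposes an attention map."""
    def __init__(self):
        super().__init__()
        self.conv = torch.nn.Conv2d(3, 3, 3, padding=1)

    def forward(self, x):
        eps = self.conv(x)
        attn = torch.softmax(eps.sum(1).flatten(1), dim=1)
        return eps, attn.view(x.size(0), x.size(2), x.size(3))

def obj_center_x(attn):
    # Scalar property: horizontal center of mass of the attention map.
    xs = torch.linspace(0, 1, attn.size(-1))
    col_mass = attn.sum(dim=-2)                     # (B, W)
    return (col_mass * xs).sum(dim=-1) / col_mass.sum(dim=-1)

def self_guided_eps(model, x_t, target_x=0.8, scale=5.0):
    x_t = x_t.detach().requires_grad_(True)
    eps, attn = model(x_t)
    loss = ((obj_center_x(attn) - target_x) ** 2).sum()
    grad = torch.autograd.grad(loss, x_t)[0]
    return eps + scale * grad                       # guided noise estimate

eps_guided = self_guided_eps(TinyDenoiser(), torch.randn(1, 3, 32, 32))
```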
- Structure-Guided Image Completion with Image-level and Object-level Semantic Discriminators [97.12135238534628]
We propose a learning paradigm that combines semantic discriminators and object-level discriminators to improve the generation of complex semantics and objects (a sketch follows this entry).
Specifically, the semantic discriminators leverage pretrained visual features to improve the realism of the generated visual concepts.
Our proposed scheme significantly improves the generation quality and achieves state-of-the-art results on various tasks.
arXiv Detail & Related papers (2022-12-13T01:36:56Z)
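A hedged sketch of a semantic discriminator that scores realism from frozen pretrained features rather than raw pixels; the ResNet-18 backbone is an assumption for illustration and may differ from the paper's choice.

```python
# Illustrative semantic discriminator: a trainable head on top of frozen
# pretrained visual features. Backbone choice is an assumption.
import torch
import torchvision.models as models

class SemanticDiscriminator(torch.nn.Module):
    def __init__(self):
        super().__init__()
        backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
        self.features = torch.nn.Sequential(*list(backbone.children())[:-1])
        for p in self.features.parameters():
            p.requires_grad = False             # keep pretrained features frozen
        self.head = torch.nn.Linear(512, 1)     # trainable real/fake score

    def forward(self, img: torch.Tensor) -> torch.Tensor:
        feat = self.features(img).flatten(1)    # (B, 512) pooled features
        return self.head(feat)                  # (B, 1) realism logit

logit = SemanticDiscriminator()(torch.randn(2, 3, 224, 224))
```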
- AttrLostGAN: Attribute Controlled Image Synthesis from Reconfigurable Layout and Style [5.912209564607099]
We propose a method for attribute controlled image synthesis from layout.
We extend a state-of-the-art approach for layout-to-image generation to condition individual objects on attributes.
Our results show that our method can successfully control the fine-grained details of individual objects when modelling complex scenes with multiple objects.
arXiv Detail & Related papers (2021-03-25T10:09:45Z)
- Context-Aware Layout to Image Generation with Enhanced Object Appearance [123.62597976732948]
A layout-to-image (L2I) generation model aims to generate a complicated image containing multiple objects (things) against a natural background (stuff).
Existing L2I models have made great progress, but object-to-object and object-to-stuff relations are often broken.
We argue that these are caused by the lack of context-aware object and stuff feature encoding in their generators, and of location-sensitive appearance representation in their discriminators (a sketch of context-aware encoding follows this entry).
arXiv Detail & Related papers (2021-03-22T14:43:25Z)
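A minimal sketch of context-aware object encoding, assuming a generic self-attention module in which each object's latent attends to all the others; the paper's actual context module may differ.

```python
# Illustrative context-aware encoder: objects exchange information via
# self-attention so relations can inform each object's appearance.
import torch

class ContextEncoder(torch.nn.Module):
    def __init__(self, dim: int = 128, heads: int = 4):
        super().__init__()
        self.attn = torch.nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, obj_feats: torch.Tensor) -> torch.Tensor:
        # obj_feats: (B, num_objects, dim); each object queries all others.
        ctx, _ = self.attn(obj_feats, obj_feats, obj_feats)
        return obj_feats + ctx                  # residual: object + its context

objs = torch.randn(2, 5, 128)                   # 2 layouts, 5 objects each
ctx_objs = ContextEncoder()(objs)               # context-aware object features
```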
- Learning Object Detection from Captions via Textual Scene Attributes [70.90708863394902]
We argue that captions contain much richer information about the image, including attributes of objects and their relations.
We present a method that uses the attributes in this "textual scene graph" to train object detectors.
We empirically demonstrate that the resulting model achieves state-of-the-art results on several challenging object detection datasets.
arXiv Detail & Related papers (2020-09-30T10:59:20Z)
- Object-Centric Image Generation from Layouts [93.10217725729468]
We develop a layout-to-image-generation method to generate complex scenes with multiple objects.
Our method learns representations of the spatial relationships between objects in the scene, which improve our model's layout fidelity.
We introduce SceneFID, an object-centric adaptation of the popular Fréchet Inception Distance metric that is better suited to multi-object images (a sketch follows this entry).
arXiv Detail & Related papers (2020-03-16T21:40:09Z)
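A short sketch of the SceneFID idea, under the assumption that it applies the standard FID computation to features of per-object crops rather than whole images; the function names here are hypothetical.

```python
# Hypothetical SceneFID-style computation: standard Frechet distance
# between Gaussians fit to features of real vs. generated object crops.
import numpy as np
from scipy import linalg

def frechet_distance(mu1, sig1, mu2, sig2):
    covmean = linalg.sqrtm(sig1 @ sig2).real
    return float(((mu1 - mu2) ** 2).sum() + np.trace(sig1 + sig2 - 2 * covmean))

def scene_fid(real_crop_feats: np.ndarray, fake_crop_feats: np.ndarray) -> float:
    """Each array is (num_object_crops, feat_dim), e.g., Inception pool3
    features of object crops extracted from ground-truth boxes."""
    mu_r, sig_r = real_crop_feats.mean(0), np.cov(real_crop_feats, rowvar=False)
    mu_f, sig_f = fake_crop_feats.mean(0), np.cov(fake_crop_feats, rowvar=False)
    return frechet_distance(mu_r, sig_r, mu_f, sig_f)
```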
- MulGAN: Facial Attribute Editing by Exemplar [2.272764591035106]
Existing methods encode attribute-related image information into a predefined region of the latent feature space by training on pairs of images with opposite attributes.
They suffer from three limitations: (1) the model must be trained on pairs of images with opposite attributes; (2) weak capability for editing multiple attributes by exemplar; and (3) poor quality of the generated images.
arXiv Detail & Related papers (2019-12-28T04:02:15Z)