AttrLostGAN: Attribute Controlled Image Synthesis from Reconfigurable Layout and Style
- URL: http://arxiv.org/abs/2103.13722v1
- Date: Thu, 25 Mar 2021 10:09:45 GMT
- Title: AttrLostGAN: Attribute Controlled Image Synthesis from Reconfigurable Layout and Style
- Authors: Stanislav Frolov, Avneesh Sharma, Jörn Hees, Tushar Karayil, Federico Raue, Andreas Dengel
- Abstract summary: We propose a method for attribute controlled image synthesis from layout.
We extend a state-of-the-art approach for layout-to-image generation to condition individual objects on attributes.
Our results show that our method can successfully control the fine-grained details of individual objects when modelling complex scenes with multiple objects.
- Score: 5.912209564607099
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Conditional image synthesis from layout has recently attracted much interest.
Previous approaches condition the generator on object locations as well as
class labels but lack fine-grained control over the diverse appearance aspects
of individual objects. Gaining control over the image generation process is
fundamental to building practical applications with a user-friendly interface. In
this paper, we propose a method for attribute controlled image synthesis from
layout which allows specifying the appearance of individual objects without
affecting the rest of the image. We extend a state-of-the-art approach for
layout-to-image generation to additionally condition individual objects on
attributes. We create and experiment on a synthetic, as well as the challenging
Visual Genome dataset. Our qualitative and quantitative results show that our
method can successfully control the fine-grained details of individual objects
when modelling complex scenes with multiple objects.
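To make the central idea concrete, the following is a minimal sketch of one way to condition a per-object style code on attributes in addition to the class label, as the abstract describes for extending a layout-to-image generator. This is not the authors' code: the module name, dimensions, and the use of the style codes for LostGAN-style per-object normalization are illustrative assumptions.

import torch
import torch.nn as nn

class ObjectStyleEncoder(nn.Module):
    """Builds a per-object style code from a class label, attributes, and noise (illustrative sketch)."""
    def __init__(self, num_classes=10, num_attributes=16,
                 label_dim=64, attr_dim=32, noise_dim=32, style_dim=128):
        super().__init__()
        self.label_embed = nn.Embedding(num_classes, label_dim)  # class id -> embedding
        self.attr_proj = nn.Linear(num_attributes, attr_dim)     # multi-hot attributes -> embedding
        self.to_style = nn.Linear(label_dim + attr_dim + noise_dim, style_dim)

    def forward(self, labels, attributes, noise):
        # labels:     (B, O)    int64 class id per object
        # attributes: (B, O, A) multi-hot attribute vector per object
        # noise:      (B, O, Z) per-object style noise
        h = torch.cat([self.label_embed(labels),
                       self.attr_proj(attributes),
                       noise], dim=-1)
        return self.to_style(h)  # (B, O, style_dim) per-object style codes

# Illustrative usage: 2 layouts with 4 objects each; in a LostGAN-style generator the
# resulting style codes would modulate feature maps inside each object's bounding box.
enc = ObjectStyleEncoder()
labels = torch.randint(0, 10, (2, 4))
attrs = torch.zeros(2, 4, 16)
attrs[:, :, 3] = 1.0  # toggle one hypothetical attribute (e.g. "red") for every object
noise = torch.randn(2, 4, 32)
styles = enc(labels, attrs, noise)  # shape: (2, 4, 128)

Keeping the attribute pathway separate from the class-label embedding is what allows changing one object's appearance without touching its class or the rest of the layout.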
Related papers
- Generating Compositional Scenes via Text-to-image RGBA Instance Generation [82.63805151691024]
Text-to-image diffusion generative models can generate high quality images at the cost of tedious prompt engineering.
We propose a novel multi-stage generation paradigm that is designed for fine-grained control, flexibility and interactivity.
Our experiments show that our RGBA diffusion model is capable of generating diverse and high quality instances with precise control over object attributes.
arXiv Detail & Related papers (2024-11-16T23:44:14Z)
- SINGAPO: Single Image Controlled Generation of Articulated Parts in Objects [20.978091381109294]
We propose a method to generate articulated objects from a single image.
Our method generates an articulated object that is visually consistent with the input image.
Our experiments show that our method outperforms the state-of-the-art in articulated object creation.
arXiv Detail & Related papers (2024-10-21T20:41:32Z)
- GroundingBooth: Grounding Text-to-Image Customization [17.185571339157075]
We introduce GroundingBooth, a framework that achieves zero-shot instance-level spatial grounding on both foreground subjects and background objects.
Our proposed text-image grounding module and masked cross-attention layer allow us to generate personalized images with both accurate layout alignment and identity preservation.
arXiv Detail & Related papers (2024-09-13T03:40:58Z)
- CustomNet: Zero-shot Object Customization with Variable-Viewpoints in Text-to-Image Diffusion Models [85.69959024572363]
CustomNet is a novel object customization approach that explicitly incorporates 3D novel view synthesis capabilities into the object customization process.
We introduce dedicated designs to enable location control and flexible background control through textual descriptions or specific user-defined images.
Our method facilitates zero-shot object customization without test-time optimization, offering simultaneous control over the viewpoints, location, and background.
arXiv Detail & Related papers (2023-10-30T17:50:14Z)
- Taming Encoder for Zero Fine-tuning Image Customization with Text-to-Image Diffusion Models [55.04969603431266]
This paper proposes a method for generating images of customized objects specified by users.
The method is based on a general framework that bypasses the lengthy optimization required by previous approaches.
We demonstrate through experiments that our proposed method is able to synthesize images with compelling output quality, appearance diversity, and object fidelity.
arXiv Detail & Related papers (2023-04-05T17:59:32Z)
- Context-Aware Layout to Image Generation with Enhanced Object Appearance [123.62597976732948]
A layout-to-image (L2I) generation model aims to generate a complex image containing multiple objects (things) against a natural background (stuff).
Existing L2I models have made great progress, but object-to-object and object-to-stuff relations are often broken.
We argue that these are caused by the lack of context-aware object and stuff feature encoding in their generators, and location-sensitive appearance representation in their discriminators.
arXiv Detail & Related papers (2021-03-22T14:43:25Z)
- Attribute-guided image generation from layout [38.817023543020134]
We propose a new image generation method that enables instance-level attribute control.
Experiments on Visual Genome dataset demonstrate our model's capacity to control object-level attributes in generated images.
The generated images from our model have higher resolution, better object classification accuracy, and greater consistency than the previous state-of-the-art.
arXiv Detail & Related papers (2020-08-27T06:22:14Z)
- Generating Person Images with Appearance-aware Pose Stylizer [66.44220388377596]
We present a novel end-to-end framework to generate realistic person images based on given person poses and appearances.
The core of our framework is a novel generator called Appearance-aware Pose Stylizer (APS) which generates human images by coupling the target pose with the conditioned person appearance progressively.
arXiv Detail & Related papers (2020-07-17T15:58:05Z)
- Object-Centric Image Generation from Layouts [93.10217725729468]
We develop a layout-to-image-generation method to generate complex scenes with multiple objects.
Our method learns representations of the spatial relationships between objects in the scene, which lead to our model's improved layout-fidelity.
We introduce SceneFID, an object-centric adaptation of the popular Fréchet Inception Distance metric, which is better suited for multi-object images (see the sketch after this list).
arXiv Detail & Related papers (2020-03-16T21:40:09Z)
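The SceneFID metric mentioned in the last entry scores object crops rather than whole images. Below is a minimal sketch of that crop-and-score idea, not the paper's implementation: the helper names, the fixed crop size, and the use of torchmetrics' FrechetInceptionDistance (assumed installed with its image dependencies) are illustrative assumptions.

import torch
import torch.nn.functional as F
from torchmetrics.image.fid import FrechetInceptionDistance

def object_crops(images, boxes, size=128):
    """Crop annotated objects from uint8 images (N, 3, H, W) given per-image pixel boxes (x0, y0, x1, y1)."""
    crops = []
    for img, img_boxes in zip(images, boxes):
        for x0, y0, x1, y1 in img_boxes:
            crop = img[:, y0:y1, x0:x1].unsqueeze(0).float()
            crops.append(F.interpolate(crop, size=(size, size),
                                       mode="bilinear", align_corners=False))
    return torch.cat(crops).clamp(0, 255).to(torch.uint8)  # (num_objects, 3, size, size)

def scene_fid(real_images, fake_images, boxes):
    """FID computed over object crops instead of full images (object-centric variant)."""
    fid = FrechetInceptionDistance(feature=2048)
    fid.update(object_crops(real_images, boxes), real=True)
    fid.update(object_crops(fake_images, boxes), real=False)
    return fid.compute()

Scoring crops rather than full images makes the metric sensitive to the quality of individual objects, which whole-image FID tends to average away in complex multi-object scenes.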