Decorating Your Own Bedroom: Locally Controlling Image Generation with
Generative Adversarial Networks
- URL: http://arxiv.org/abs/2105.08222v1
- Date: Tue, 18 May 2021 01:31:49 GMT
- Authors: Chen Zhang, Yinghao Xu, Yujun Shen
- Abstract summary: We propose an effective approach, termed LoGAN, to support local editing of the output image.
We are able to seamlessly remove, insert, shift, and rotate the individual objects inside a room.
Our method can completely clear out a room and then refurnish it with customized furniture and styles.
- Score: 15.253043666814413
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Generative Adversarial Networks (GANs) have made great success in
synthesizing high-quality images. However, how to steer the generation process
of a well-trained GAN model and customize the output image is much less
explored. It has recently been found that modulating the input latent code used
in GANs can reasonably alter some variation factors in the output image, but
such manipulation usually changes the entire image as a whole. In
this work, we propose an effective approach, termed LoGAN, to support local
editing of the output image. Concretely, we introduce two operators, i.e.,
content modulation and style modulation, together with a priority mask to
facilitate the precise control of the intermediate generative features. Taking
bedroom synthesis as an instance, we are able to seamlessly remove, insert,
shift, and rotate the individual objects inside a room. Furthermore, our method
can completely clear out a room and then refurnish it with customized furniture
and styles. Experimental results demonstrate the great potential of steering
the image generation of pre-trained GANs for versatile image editing.
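The abstract's two operators can be illustrated with a minimal sketch: style modulation rescales intermediate features channel-wise, content modulation supplies replacement features, and a priority mask decides, per spatial location, which one wins. The function names, tensor shapes, and the simple linear blend below are illustrative assumptions, not the paper's actual operators.

```python
import numpy as np

def local_modulation(features, style, content, mask):
    """Blend style- and content-modulated features under a priority mask.

    features: (C, H, W) intermediate generator features
    style:    (C,) per-channel scale (a simple form of style modulation)
    content:  (C, H, W) replacement content features
    mask:     (H, W) priority mask in [0, 1]; 1 marks the edit region
    """
    styled = features * style[:, None, None]          # channel-wise style modulation
    edited = mask * content + (1.0 - mask) * styled   # content takes priority inside the mask
    return edited

# Toy example on random tensors: an all-zero mask leaves the styled features untouched
rng = np.random.default_rng(0)
feats = rng.standard_normal((4, 8, 8))
out = local_modulation(feats, np.ones(4), np.zeros((4, 8, 8)), np.zeros((8, 8)))
```

With a zero mask and unit style scales, the output equals the input features, which is the identity behavior one would want from a local editing operator outside the edit region.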
Related papers
- Generating Compositional Scenes via Text-to-image RGBA Instance Generation [82.63805151691024]
Text-to-image diffusion generative models can generate high quality images at the cost of tedious prompt engineering.
We propose a novel multi-stage generation paradigm that is designed for fine-grained control, flexibility and interactivity.
Our experiments show that our RGBA diffusion model is capable of generating diverse and high quality instances with precise control over object attributes.
arXiv Detail & Related papers (2024-11-16T23:44:14Z)
- Streamlining Image Editing with Layered Diffusion Brushes [8.738398948669609]
Our system renders a single edit on a 512x512 image within 140 ms using a high-end consumer GPU.
Our approach demonstrates efficacy across a range of tasks, including object attribute adjustments, error correction, and sequential prompt-based object placement and manipulation.
arXiv Detail & Related papers (2024-05-01T04:30:03Z)
- Latent Space Editing in Transformer-Based Flow Matching [53.75073756305241]
Flow Matching with a transformer backbone offers the potential for scalable and high-quality generative modeling.
We introduce an editing space, $u$-space, that can be manipulated in a controllable, accumulative, and composable manner.
Lastly, we put forth a straightforward yet powerful method for achieving fine-grained and nuanced editing using text prompts.
arXiv Detail & Related papers (2023-12-17T21:49:59Z)
- In-Domain GAN Inversion for Faithful Reconstruction and Editability [132.68255553099834]
We propose in-domain GAN inversion, which consists of a domain-guided encoder and a domain-regularized optimization to regularize the inverted code in the native latent space of the pre-trained GAN model.
We make comprehensive analyses on the effects of the encoder structure, the starting inversion point, as well as the inversion parameter space, and observe the trade-off between the reconstruction quality and the editing property.
arXiv Detail & Related papers (2023-09-25T08:42:06Z)
- Gradient Adjusting Networks for Domain Inversion [82.72289618025084]
StyleGAN2 was demonstrated to be a powerful image generation engine that supports semantic editing.
We present a per-image optimization method that tunes a StyleGAN2 generator so that it achieves a local edit to the generator's weights.
Our experiments show a sizable gap in performance over the current state of the art in this very active domain.
arXiv Detail & Related papers (2023-02-22T14:47:57Z)
- Spatial Steerability of GANs via Self-Supervision from Discriminator [123.27117057804732]
We propose a self-supervised approach to improve the spatial steerability of GANs without searching for steerable directions in the latent space.
Specifically, we design randomly sampled Gaussian heatmaps to be encoded into the intermediate layers of generative models as spatial inductive bias.
During inference, users can interact with the spatial heatmaps in an intuitive manner, enabling them to edit the output image by adjusting the scene layout, moving, or removing objects.
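The heatmap idea above can be sketched in a few lines: sample a 2D Gaussian bump and add it to an intermediate feature map as a spatial prior. The additive injection and the function names are simplifying assumptions for illustration; the paper's actual encoding of heatmaps into the generator may differ.

```python
import numpy as np

def gaussian_heatmap(h, w, cy, cx, sigma):
    """2D Gaussian heatmap of shape (h, w), peaked at (cy, cx)."""
    ys, xs = np.mgrid[0:h, 0:w]
    return np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2.0 * sigma ** 2))

def inject_heatmap(features, heatmap, weight=1.0):
    """Add a spatial heatmap to every channel of an intermediate feature map,
    one simple way to impose a spatial inductive bias on the generator."""
    return features + weight * heatmap[None, :, :]

# Peak at the center of a 16x16 map; moving (cy, cx) would move the induced layout
heat = gaussian_heatmap(16, 16, 8, 8, sigma=3.0)
feats = np.zeros((4, 16, 16))
biased = inject_heatmap(feats, heat)
```

Shifting the heatmap center at inference time is then a natural interface for the layout edits the summary describes: the spatial bias moves, and the generated objects follow.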
arXiv Detail & Related papers (2023-01-20T07:36:29Z)
- Paint by Example: Exemplar-based Image Editing with Diffusion Models [35.84464684227222]
In this paper, we investigate exemplar-guided image editing for more precise control.
We achieve this goal by leveraging self-supervised training to disentangle and re-organize the source image and the exemplar.
We demonstrate that our method achieves an impressive performance and enables controllable editing on in-the-wild images with high fidelity.
arXiv Detail & Related papers (2022-11-23T18:59:52Z)
- SemanticStyleGAN: Learning Compositional Generative Priors for Controllable Image Synthesis and Editing [35.02841064647306]
StyleGANs provide promising prior models for downstream tasks on image synthesis and editing.
We present SemanticStyleGAN, where a generator is trained to model local semantic parts separately and synthesizes images in a compositional way.
arXiv Detail & Related papers (2021-12-04T04:17:11Z)
- Ensembling with Deep Generative Views [72.70801582346344]
Generative models can synthesize "views" of artificial images that mimic real-world variations, such as changes in color or pose.
Here, we investigate whether such views can be applied to real images to benefit downstream analysis tasks such as image classification.
We use StyleGAN2 as the source of generative augmentations and investigate this setup on classification tasks involving facial attributes, cat faces, and cars.
arXiv Detail & Related papers (2021-04-29T17:58:35Z)
- Navigating the GAN Parameter Space for Semantic Image Editing [35.622710993417456]
Generative Adversarial Networks (GANs) are an indispensable tool for visual editing.
In this paper, we significantly expand the range of visual effects achievable with the state-of-the-art models, like StyleGAN2.
arXiv Detail & Related papers (2020-11-27T15:38:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.