LayeringDiff: Layered Image Synthesis via Generation, then Disassembly with Generative Knowledge
- URL: http://arxiv.org/abs/2501.01197v1
- Date: Thu, 02 Jan 2025 11:18:25 GMT
- Title: LayeringDiff: Layered Image Synthesis via Generation, then Disassembly with Generative Knowledge
- Authors: Kyoungkook Kang, Gyujin Sim, Geonung Kim, Donguk Kim, Seungho Nam, Sunghyun Cho
- Abstract summary: LayeringDiff is a novel pipeline for the synthesis of layered images. By extracting layers from a composite image, rather than generating them from scratch, LayeringDiff bypasses the need for large-scale training. For effective layer decomposition, we adapt a large-scale pretrained generative prior to estimate foreground and background layers.
- Score: 14.481577976493236
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Layers have become indispensable tools for professional artists, allowing them to build a hierarchical structure that enables independent control over individual visual elements. In this paper, we propose LayeringDiff, a novel pipeline for the synthesis of layered images, which begins by generating a composite image using an off-the-shelf image generative model, followed by disassembling the image into its constituent foreground and background layers. By extracting layers from a composite image, rather than generating them from scratch, LayeringDiff bypasses the need for large-scale training to develop generative capabilities for individual layers. Furthermore, by utilizing a pretrained off-the-shelf generative model, our method can produce diverse contents and object scales in synthesized layers. For effective layer decomposition, we adapt a large-scale pretrained generative prior to estimate foreground and background layers. We also propose high-frequency alignment modules to refine the fine details of the estimated layers. Our comprehensive experiments demonstrate that our approach effectively synthesizes layered images and supports various practical applications.
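The abstract describes a two-stage flow that is easy to sketch: synthesize a composite with a stock text-to-image model, then estimate layers from it. Below is a minimal, hypothetical sketch of that flow in Python; the diffusers call uses a real off-the-shelf generator, while `decompose_layers` is a placeholder for the paper's adapted generative prior and high-frequency alignment modules, which are not reproduced here.

```python
# Sketch of the generate-then-disassemble pipeline described in the abstract.
# Stage 1 uses a real off-the-shelf model via diffusers; decompose_layers is
# a HYPOTHETICAL stand-in for the paper's learned decomposition stage.
import torch
from diffusers import StableDiffusionPipeline

def decompose_layers(composite):
    """Placeholder for LayeringDiff's layer-decomposition stage.

    The paper adapts a pretrained generative prior to estimate the
    foreground and background layers, then refines fine details with
    high-frequency alignment modules; those details are not public here.
    """
    raise NotImplementedError("stand-in for the learned decomposition stage")

# Stage 1: synthesize a composite image with an off-the-shelf generator.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
composite = pipe("a corgi wearing a party hat on a beach").images[0]

# Stage 2: disassemble the composite into editable layers.
foreground_rgba, background_rgb = decompose_layers(composite)
```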
Related papers
- OmniPSD: Layered PSD Generation with Diffusion Transformer [59.20320950128599]
We propose OmniPSD, a unified diffusion framework built upon the Flux ecosystem. It enables text-to-PSD generation and image-to-PSD decomposition through in-context learning. Experiments on our new RGBA-layered dataset demonstrate that OmniPSD achieves high-fidelity generation.
arXiv Detail & Related papers (2025-12-10T02:09:59Z)
- From Inpainting to Layer Decomposition: Repurposing Generative Inpainting Models for Image Layer Decomposition [16.7393689710179]
A layered representation enables independent editing of elements, offering greater flexibility for content creation. We observe a strong connection between layer decomposition and in/outpainting tasks, and propose adapting a diffusion-based inpainting model for layer decomposition using lightweight finetuning. To further preserve detail in the latent space, we introduce a novel multi-modal context fusion module with linear attention complexity.
arXiv Detail & Related papers (2025-11-26T02:50:07Z)
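The connection claimed above is concrete: given a foreground mask, recovering the occluded background is exactly an inpainting problem. A minimal sketch with the stock diffusers inpainting pipeline (not the paper's finetuned model or its multi-modal context fusion module; file names and the prompt are illustrative):

```python
# Minimal illustration of the inpainting <-> decomposition connection:
# recover a background layer by inpainting behind a foreground mask.
# Uses the STOCK inpainting pipeline, not the paper's finetuned model.
import torch
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
).to("cuda")

composite = Image.open("composite.png").convert("RGB").resize((512, 512))
# White pixels mark the foreground to remove; the mask would come from
# any matting or segmentation model.
fg_mask = Image.open("foreground_mask.png").convert("L").resize((512, 512))

background = pipe(
    prompt="empty background, nothing in the foreground",
    image=composite,
    mask_image=fg_mask,
).images[0]
background.save("background_layer.png")
```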
- LayerComposer: Interactive Personalized T2I via Spatially-Aware Layered Canvas [47.5187068545227]
We present LayerComposer, an interactive framework for personalized, multi-subject text-to-image generation. The proposed layered canvas allows users to place, resize, or lock input subjects through intuitive layer manipulation. Our locking mechanism requires no architectural changes, relying instead on inherent positional embeddings combined with a new complementary data sampling strategy.
arXiv Detail & Related papers (2025-10-23T17:59:55Z)
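The place/resize/lock interaction model maps naturally onto a small layer-stack data structure. A hypothetical sketch of that interface, not LayerComposer's actual code:

```python
# Hypothetical sketch of a layered canvas with place/resize/lock semantics,
# illustrating the interaction model only; not LayerComposer's implementation.
from dataclasses import dataclass, field

@dataclass
class Layer:
    subject: str          # identifier for the personalized subject
    x: int                # top-left placement on the canvas
    y: int
    w: int                # size after resizing
    h: int
    locked: bool = False  # locked layers should be preserved verbatim

@dataclass
class LayeredCanvas:
    width: int
    height: int
    layers: list = field(default_factory=list)  # back-to-front order

    def place(self, subject, x, y, w, h):
        self.layers.append(Layer(subject, x, y, w, h))

    def resize(self, index, w, h):
        if self.layers[index].locked:
            raise ValueError("layer is locked")
        self.layers[index].w, self.layers[index].h = w, h

    def lock(self, index):
        self.layers[index].locked = True

canvas = LayeredCanvas(1024, 1024)
canvas.place("person_A", 100, 300, 400, 600)
canvas.lock(0)  # request high-fidelity preservation of this subject
```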
- DiffDecompose: Layer-Wise Decomposition of Alpha-Composited Images via Diffusion Transformers [85.1185656296496]
We present DiffDecompose, a diffusion Transformer-based framework that learns the posterior over possible layer decompositions conditioned on the input image. The code and dataset will be available upon paper acceptance.
arXiv Detail & Related papers (2025-05-24T16:08:04Z)
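Layer-wise decomposition is ill-posed because alpha compositing collapses layers into one observation, C = αF + (1 − α)B, so distinct (F, B) pairs can explain the same composite; that ambiguity is why a posterior is learned rather than a point estimate. A toy numpy demonstration (generic compositing math, not DiffDecompose itself):

```python
# Alpha compositing C = alpha*F + (1 - alpha)*B, and why inverting it is
# ill-posed: two different layer pairs produce the same composite.
import numpy as np

rng = np.random.default_rng(0)
F1 = rng.random((4, 4, 3))          # foreground RGB
B1 = rng.random((4, 4, 3))          # background RGB
alpha = np.full((4, 4, 1), 0.5)     # uniform 50% opacity

C = alpha * F1 + (1 - alpha) * B1   # forward compositing

# A different (F, B) pair yielding the exact same composite:
delta = 0.1
F2, B2 = F1 - delta, B1 + delta     # opposite shifts cancel at alpha = 0.5
C2 = alpha * F2 + (1 - alpha) * B2
assert np.allclose(C, C2)           # same observation, different layers
```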
- PSDiffusion: Harmonized Multi-Layer Image Generation via Layout and Appearance Alignment [23.67447416568964]
Transparent image layer generation plays a significant role in digital art and design. Existing methods typically decompose transparent layers from a single RGB image using a set of tools or generate multiple transparent layers sequentially. We propose PSDiffusion, a unified diffusion framework that leverages image composition priors from a pre-trained image diffusion model for simultaneous multi-layer text-to-image generation.
arXiv Detail & Related papers (2025-05-16T17:23:35Z)
- DreamLayer: Simultaneous Multi-Layer Generation via Diffusion Model [47.32061459437175]
We introduce DreamLayer, a framework that enables coherent text-driven generation of multiple image layers.
By explicitly modeling the relationship between transparent foreground and background layers, DreamLayer builds inter-layer connections.
Experiments and user studies demonstrate that DreamLayer generates more coherent and well-aligned layers.
arXiv Detail & Related papers (2025-03-17T05:34:11Z)
- LayerFusion: Harmonized Multi-Layer Text-to-Image Generation with Generative Priors [38.47462111828742]
Layered content generation is crucial for creative fields like graphic design, animation, and digital art. We propose a novel image generation pipeline based on Latent Diffusion Models (LDMs) that generates images with two layers. We show significant improvements in visual coherence, image quality, and layer consistency compared to baseline methods.
arXiv Detail & Related papers (2024-12-05T18:59:18Z)
- Generative Image Layer Decomposition with Visual Effects [49.75021036203426]
LayerDecomp is a generative framework for image layer decomposition. It produces clean backgrounds and high-quality transparent foregrounds with faithfully preserved visual effects. Our method achieves superior quality in layer decomposition, outperforming existing approaches in object removal and spatial editing tasks.
arXiv Detail & Related papers (2024-11-26T20:26:49Z)
- LayerDiff: Exploring Text-guided Multi-layered Composable Image Synthesis via Layer-Collaborative Diffusion Model [70.14953942532621]
LayerDiff is a layer-collaborative diffusion model designed for text-guided, multi-layered, composable image synthesis.
Our model can generate high-quality multi-layered images with performance comparable to conventional whole-image generation methods.
LayerDiff enables a broader range of controllable generative applications, including layer-specific image editing and style transfer.
arXiv Detail & Related papers (2024-03-18T16:28:28Z)
- ControlCom: Controllable Image Composition using Diffusion Model [45.48263800282992]
We propose a controllable image composition method that unifies four tasks in one diffusion model.
We also propose a local enhancement module to enhance the foreground details in the diffusion model.
The proposed method is evaluated on both a public benchmark and real-world data.
arXiv Detail & Related papers (2023-08-19T14:56:44Z)
- Text2Layer: Layered Image Generation using Latent Diffusion Model [12.902259486204898]
We propose to approach image synthesis from a layered image generation perspective.
To achieve layered image generation, we train an autoencoder that is able to reconstruct layered images.
Experimental results show that the proposed method is able to generate high-quality layered images.
arXiv Detail & Related papers (2023-07-19T06:56:07Z)
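Training a latent diffusion model on layered images presupposes an autoencoder whose latent jointly encodes all layers. A hypothetical minimal PyTorch sketch of such an autoencoder over channel-concatenated layers (RGBA foreground plus RGB background); the architecture is illustrative, not the paper's:

```python
# Hypothetical minimal autoencoder over a layered image, represented as
# channel-concatenated layers (RGBA foreground + RGB background = 7 ch).
# Illustrates a shared latent for all layers; not Text2Layer's architecture.
import torch
import torch.nn as nn

class LayeredAutoencoder(nn.Module):
    def __init__(self, in_ch: int = 7, latent_ch: int = 4):
        super().__init__()
        self.encoder = nn.Sequential(          # 7 x H x W -> latent map
            nn.Conv2d(in_ch, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(128, latent_ch, 3, padding=1),
        )
        self.decoder = nn.Sequential(          # latent map -> 7 x H x W
            nn.Conv2d(latent_ch, 128, 3, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, in_ch, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, layers: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(layers))

model = LayeredAutoencoder()
x = torch.rand(1, 7, 256, 256)                 # fg RGBA + bg RGB, stacked
recon = model(x)
loss = nn.functional.mse_loss(recon, x)        # reconstruction objective
```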
- Composer: Creative and Controllable Image Synthesis with Composable Conditions [57.78533372393828]
Recent large-scale generative models learned on big data are capable of synthesizing incredible images yet suffer from limited controllability.
This work offers a new generation paradigm that allows flexible control of the output image, such as spatial layout and palette, while maintaining the synthesis quality and model creativity.
arXiv Detail & Related papers (2023-02-20T05:48:41Z)
- SLIDE: Single Image 3D Photography with Soft Layering and Depth-aware Inpainting [54.419266357283966]
Single image 3D photography enables viewers to view a still image from novel viewpoints.
Recent approaches combine monocular depth networks with inpainting networks to achieve compelling results.
We present SLIDE, a modular and unified system for single image 3D photography.
arXiv Detail & Related papers (2021-09-02T16:37:20Z)
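"Soft layering" can be read as a smooth, depth-dependent opacity instead of a hard depth threshold, which avoids seams at layer boundaries. A toy numpy illustration of that general idea (not SLIDE's actual formulation; the sigmoid falloff is an assumption):

```python
# Toy illustration of soft layering: split an image into foreground and
# background by depth with a SMOOTH alpha rather than a hard threshold.
# Generic idea only; not SLIDE's actual formulation.
import numpy as np

def soft_split(image, depth, threshold=0.5, softness=0.05):
    """image: HxWx3 float, depth: HxW float in [0, 1] (0 = near)."""
    # Sigmoid falloff around the threshold -> soft alpha in [0, 1];
    # near pixels (small depth) get alpha close to 1.
    alpha = 1.0 / (1.0 + np.exp((depth - threshold) / softness))
    alpha = alpha[..., None]                    # HxWx1 for broadcasting
    foreground = np.concatenate([image, alpha], axis=-1)   # RGBA layer
    background_visible = image * (1.0 - alpha)  # remainder, to be inpainted
    return foreground, background_visible, alpha

img = np.random.rand(4, 4, 3)
dep = np.random.rand(4, 4)
fg, bg, a = soft_split(img, dep)
```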
- Deep Image Compositing [93.75358242750752]
We propose a new method which can automatically generate high-quality image composites without any user input.
Inspired by Laplacian pyramid blending, a dense-connected multi-stream fusion network is proposed to effectively fuse the information from the foreground and background images.
Experiments show that the proposed method can automatically generate high-quality composites and outperforms existing methods both qualitatively and quantitatively.
arXiv Detail & Related papers (2020-11-04T06:12:24Z)
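Classical Laplacian pyramid blending, which the fusion network above takes its inspiration from, merges band-pass levels of the two images under a progressively smoothed mask. A compact OpenCV sketch of the classical method (not the paper's network):

```python
# Classical Laplacian pyramid blending (Burt & Adelson), the technique the
# paper's fusion network is inspired by; not the network itself.
import cv2
import numpy as np

def laplacian_blend(fg, bg, mask, levels=5):
    """fg, bg: HxWx3 float32 in [0, 1]; mask: HxWx3 float32 in [0, 1].
    H and W should be divisible by 2**levels."""
    gf, gb, gm = [fg], [bg], [mask]
    for _ in range(levels):                     # Gaussian pyramids
        gf.append(cv2.pyrDown(gf[-1]))
        gb.append(cv2.pyrDown(gb[-1]))
        gm.append(cv2.pyrDown(gm[-1]))

    # Blend the coarsest Gaussian level, then add blended Laplacian bands.
    out = gm[-1] * gf[-1] + (1 - gm[-1]) * gb[-1]
    for i in range(levels - 1, -1, -1):         # coarse -> fine
        lap_f = gf[i] - cv2.pyrUp(gf[i + 1])    # band-pass detail of fg
        lap_b = gb[i] - cv2.pyrUp(gb[i + 1])    # band-pass detail of bg
        out = cv2.pyrUp(out) + gm[i] * lap_f + (1 - gm[i]) * lap_b
    return np.clip(out, 0.0, 1.0)

# Usage (all inputs HxWx3 float32, H and W divisible by 2**levels):
# out = laplacian_blend(fg / 255.0, bg / 255.0,
#                       np.repeat(mask[..., None], 3, axis=2))
```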