LayeringDiff: Layered Image Synthesis via Generation, then Disassembly with Generative Knowledge
- URL: http://arxiv.org/abs/2501.01197v1
- Date: Thu, 02 Jan 2025 11:18:25 GMT
- Title: LayeringDiff: Layered Image Synthesis via Generation, then Disassembly with Generative Knowledge
- Authors: Kyoungkook Kang, Gyujin Sim, Geonung Kim, Donguk Kim, Seungho Nam, Sunghyun Cho
- Abstract summary: LayeringDiff is a novel pipeline for the synthesis of layered images. By extracting layers from a composite image, rather than generating them from scratch, LayeringDiff bypasses the need for large-scale training. For effective layer decomposition, we adapt a large-scale pretrained generative prior to estimate foreground and background layers.
- Score: 14.481577976493236
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Layers have become indispensable tools for professional artists, allowing them to build a hierarchical structure that enables independent control over individual visual elements. In this paper, we propose LayeringDiff, a novel pipeline for the synthesis of layered images, which begins by generating a composite image using an off-the-shelf image generative model, followed by disassembling the image into its constituent foreground and background layers. By extracting layers from a composite image, rather than generating them from scratch, LayeringDiff bypasses the need for large-scale training to develop generative capabilities for individual layers. Furthermore, by utilizing a pretrained off-the-shelf generative model, our method can produce diverse contents and object scales in synthesized layers. For effective layer decomposition, we adapt a large-scale pretrained generative prior to estimate foreground and background layers. We also propose high-frequency alignment modules to refine the fine details of the estimated layers. Our comprehensive experiments demonstrate that our approach effectively synthesizes layered images and supports various practical applications.
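The abstract describes a two-stage flow that is easy to sketch: synthesize a composite with a stock text-to-image model, then estimate layers from it. Below is a minimal, hypothetical sketch of that flow in Python; the diffusers call uses a real off-the-shelf generator, while `decompose_layers` is a placeholder for the paper's adapted generative prior and high-frequency alignment modules, which are not reproduced here.

```python
# Sketch of the generate-then-disassemble pipeline described in the abstract.
# Stage 1 uses a real off-the-shelf model via diffusers; decompose_layers is
# a HYPOTHETICAL stand-in for the paper's learned decomposition stage.
import torch
from diffusers import StableDiffusionPipeline

def decompose_layers(composite):
    """Placeholder for LayeringDiff's layer-decomposition stage.

    The paper adapts a pretrained generative prior to estimate the
    foreground and background layers, then refines fine details with
    high-frequency alignment modules; those details are not public here.
    """
    raise NotImplementedError("stand-in for the learned decomposition stage")

# Stage 1: synthesize a composite image with an off-the-shelf generator.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
composite = pipe("a corgi wearing a party hat on a beach").images[0]

# Stage 2: disassemble the composite into editable layers.
foreground_rgba, background_rgb = decompose_layers(composite)
```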
Related papers
- OmniPSD: Layered PSD Generation with Diffusion Transformer [59.20320950128599]
We propose OmniPSD, a unified diffusion framework built upon the Flux ecosystem. It enables text-to-PSD generation and image-to-PSD decomposition through in-context learning. Experiments on our new RGBA-layered dataset demonstrate that OmniPSD achieves high-fidelity generation.
arXiv Detail & Related papers (2025-12-10T02:09:59Z)
- From Inpainting to Layer Decomposition: Repurposing Generative Inpainting Models for Image Layer Decomposition [16.7393689710179]
A layered representation enables independent editing of elements, offering greater flexibility for content creation. We observe a strong connection between layer decomposition and in/outpainting tasks, and propose adapting a diffusion-based inpainting model for layer decomposition using lightweight finetuning. To further preserve detail in the latent space, we introduce a novel multi-modal context fusion module with linear attention complexity.
arXiv Detail & Related papers (2025-11-26T02:50:07Z)
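The connection claimed above is concrete: given a foreground mask, recovering the occluded background is exactly an inpainting problem. A minimal sketch with the stock diffusers inpainting pipeline (not the paper's finetuned model or its multi-modal context fusion module; file names and the prompt are illustrative):

```python
# Minimal illustration of the inpainting <-> decomposition connection:
# recover a background layer by inpainting behind a foreground mask.
# Uses the STOCK inpainting pipeline, not the paper's finetuned model.
import torch
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
).to("cuda")

composite = Image.open("composite.png").convert("RGB").resize((512, 512))
# White pixels mark the foreground to remove; the mask would come from
# any matting or segmentation model.
fg_mask = Image.open("foreground_mask.png").convert("L").resize((512, 512))

background = pipe(
    prompt="empty background, nothing in the foreground",
    image=composite,
    mask_image=fg_mask,
).images[0]
background.save("background_layer.png")
```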
- LayerComposer: Interactive Personalized T2I via Spatially-Aware Layered Canvas [47.5187068545227]
We present LayerComposer, an interactive framework for personalized, multi-subject text-to-image generation. The proposed layered canvas allows users to place, resize, or lock input subjects through intuitive layer manipulation. Our locking mechanism requires no architectural changes, relying instead on inherent positional embeddings combined with a new complementary data sampling strategy.
arXiv Detail & Related papers (2025-10-23T17:59:55Z)
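The place/resize/lock interaction model maps naturally onto a small layer-stack data structure. A hypothetical sketch of that interface, not LayerComposer's actual code:

```python
# Hypothetical sketch of a layered canvas with place/resize/lock semantics,
# illustrating the interaction model only; not LayerComposer's implementation.
from dataclasses import dataclass, field

@dataclass
class Layer:
    subject: str          # identifier for the personalized subject
    x: int                # top-left placement on the canvas
    y: int
    w: int                # size after resizing
    h: int
    locked: bool = False  # locked layers should be preserved verbatim

@dataclass
class LayeredCanvas:
    width: int
    height: int
    layers: list = field(default_factory=list)  # back-to-front order

    def place(self, subject, x, y, w, h):
        self.layers.append(Layer(subject, x, y, w, h))

    def resize(self, index, w, h):
        if self.layers[index].locked:
            raise ValueError("layer is locked")
        self.layers[index].w, self.layers[index].h = w, h

    def lock(self, index):
        self.layers[index].locked = True

canvas = LayeredCanvas(1024, 1024)
canvas.place("person_A", 100, 300, 400, 600)
canvas.lock(0)  # request high-fidelity preservation of this subject
```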
- DiffDecompose: Layer-Wise Decomposition of Alpha-Composited Images via Diffusion Transformers [85.1185656296496]
We present DiffDecompose, a diffusion Transformer-based framework that learns the posterior over possible layer decompositions conditioned on the input image. The code and dataset will be available upon paper acceptance.
arXiv Detail & Related papers (2025-05-24T16:08:04Z)
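Layer-wise decomposition is ill-posed because alpha compositing collapses layers into one observation, C = αF + (1 − α)B, so distinct (F, B) pairs can explain the same composite; that ambiguity is why a posterior is learned rather than a point estimate. A toy numpy demonstration (generic compositing math, not DiffDecompose itself):

```python
# Alpha compositing C = alpha*F + (1 - alpha)*B, and why inverting it is
# ill-posed: two different layer pairs produce the same composite.
import numpy as np

rng = np.random.default_rng(0)
F1 = rng.random((4, 4, 3))          # foreground RGB
B1 = rng.random((4, 4, 3))          # background RGB
alpha = np.full((4, 4, 1), 0.5)     # uniform 50% opacity

C = alpha * F1 + (1 - alpha) * B1   # forward compositing

# A different (F, B) pair yielding the exact same composite:
delta = 0.1
F2, B2 = F1 - delta, B1 + delta     # opposite shifts cancel at alpha = 0.5
C2 = alpha * F2 + (1 - alpha) * B2
assert np.allclose(C, C2)           # same observation, different layers
```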
- PSDiffusion: Harmonized Multi-Layer Image Generation via Layout and Appearance Alignment [23.67447416568964]
Transparent image layer generation plays a significant role in digital art and design. Existing methods typically decompose transparent layers from a single RGB image using a set of tools or generate multiple transparent layers sequentially. We propose PSDiffusion, a unified diffusion framework that leverages image composition priors from a pre-trained image diffusion model for simultaneous multi-layer text-to-image generation.
arXiv Detail & Related papers (2025-05-16T17:23:35Z)
- DreamLayer: Simultaneous Multi-Layer Generation via Diffusion Model [47.32061459437175]
We introduce DreamLayer, a framework that enables coherent text-driven generation of multiple image layers.
By explicitly modeling the relationship between transparent foreground and background layers, DreamLayer builds inter-layer connections.
Experiments and user studies demonstrate that DreamLayer generates more coherent and well-aligned layers.
arXiv Detail & Related papers (2025-03-17T05:34:11Z)
- LayerFusion: Harmonized Multi-Layer Text-to-Image Generation with Generative Priors [38.47462111828742]
Layered content generation is crucial for creative fields like graphic design, animation, and digital art. We propose a novel image generation pipeline based on Latent Diffusion Models (LDMs) that generates images with two layers. We show significant improvements in visual coherence, image quality, and layer consistency compared to baseline methods.
arXiv Detail & Related papers (2024-12-05T18:59:18Z)
- Generative Image Layer Decomposition with Visual Effects [49.75021036203426]
LayerDecomp is a generative framework for image layer decomposition. It produces clean backgrounds and high-quality transparent foregrounds with faithfully preserved visual effects. Our method achieves superior quality in layer decomposition, outperforming existing approaches in object removal and spatial editing tasks.
arXiv Detail & Related papers (2024-11-26T20:26:49Z)
- LayerDiff: Exploring Text-guided Multi-layered Composable Image Synthesis via Layer-Collaborative Diffusion Model [70.14953942532621]
LayerDiff is a layer-collaborative diffusion model designed for text-guided, multi-layered, composable image synthesis.
Our model can generate high-quality multi-layered images with performance comparable to conventional whole-image generation methods.
LayerDiff enables a broader range of controllable generative applications, including layer-specific image editing and style transfer.
arXiv Detail & Related papers (2024-03-18T16:28:28Z)
- ControlCom: Controllable Image Composition using Diffusion Model [45.48263800282992]
We propose a controllable image composition method that unifies four tasks in one diffusion model.
We also propose a local enhancement module to enhance the foreground details in the diffusion model.
The proposed method is evaluated on both a public benchmark and real-world data.
arXiv Detail & Related papers (2023-08-19T14:56:44Z)
- Text2Layer: Layered Image Generation using Latent Diffusion Model [12.902259486204898]
We propose to approach image synthesis from a layered image generation perspective.
To achieve layered image generation, we train an autoencoder that is able to reconstruct layered images.
Experimental results show that the proposed method is able to generate high-quality layered images.
arXiv Detail & Related papers (2023-07-19T06:56:07Z)
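Training a latent diffusion model on layered images presupposes an autoencoder whose latent jointly encodes all layers. A hypothetical minimal PyTorch sketch of such an autoencoder over channel-concatenated layers (RGBA foreground plus RGB background); the architecture is illustrative, not the paper's:

```python
# Hypothetical minimal autoencoder over a layered image, represented as
# channel-concatenated layers (RGBA foreground + RGB background = 7 ch).
# Illustrates a shared latent for all layers; not Text2Layer's architecture.
import torch
import torch.nn as nn

class LayeredAutoencoder(nn.Module):
    def __init__(self, in_ch: int = 7, latent_ch: int = 4):
        super().__init__()
        self.encoder = nn.Sequential(          # 7 x H x W -> latent map
            nn.Conv2d(in_ch, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(128, latent_ch, 3, padding=1),
        )
        self.decoder = nn.Sequential(          # latent map -> 7 x H x W
            nn.Conv2d(latent_ch, 128, 3, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, in_ch, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, layers: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(layers))

model = LayeredAutoencoder()
x = torch.rand(1, 7, 256, 256)                 # fg RGBA + bg RGB, stacked
recon = model(x)
loss = nn.functional.mse_loss(recon, x)        # reconstruction objective
```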
- Composer: Creative and Controllable Image Synthesis with Composable Conditions [57.78533372393828]
Recent large-scale generative models learned on big data are capable of synthesizing incredible images yet suffer from limited controllability.
This work offers a new generation paradigm that allows flexible control of the output image, such as spatial layout and palette, while maintaining the synthesis quality and model creativity.
arXiv Detail & Related papers (2023-02-20T05:48:41Z)
- SLIDE: Single Image 3D Photography with Soft Layering and Depth-aware Inpainting [54.419266357283966]
Single image 3D photography enables viewers to view a still image from novel viewpoints.
Recent approaches combine monocular depth networks with inpainting networks to achieve compelling results.
We present SLIDE, a modular and unified system for single image 3D photography.
arXiv Detail & Related papers (2021-09-02T16:37:20Z)
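"Soft layering" can be read as a smooth, depth-dependent opacity instead of a hard depth threshold, which avoids seams at layer boundaries. A toy numpy illustration of that general idea (not SLIDE's actual formulation; the sigmoid falloff is an assumption):

```python
# Toy illustration of soft layering: split an image into foreground and
# background by depth with a SMOOTH alpha rather than a hard threshold.
# Generic idea only; not SLIDE's actual formulation.
import numpy as np

def soft_split(image, depth, threshold=0.5, softness=0.05):
    """image: HxWx3 float, depth: HxW float in [0, 1] (0 = near)."""
    # Sigmoid falloff around the threshold -> soft alpha in [0, 1];
    # near pixels (small depth) get alpha close to 1.
    alpha = 1.0 / (1.0 + np.exp((depth - threshold) / softness))
    alpha = alpha[..., None]                    # HxWx1 for broadcasting
    foreground = np.concatenate([image, alpha], axis=-1)   # RGBA layer
    background_visible = image * (1.0 - alpha)  # remainder, to be inpainted
    return foreground, background_visible, alpha

img = np.random.rand(4, 4, 3)
dep = np.random.rand(4, 4)
fg, bg, a = soft_split(img, dep)
```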
- Deep Image Compositing [93.75358242750752]
We propose a new method which can automatically generate high-quality image composites without any user input.
Inspired by Laplacian pyramid blending, a dense-connected multi-stream fusion network is proposed to effectively fuse the information from the foreground and background images.
Experiments show that the proposed method can automatically generate high-quality composites and outperforms existing methods both qualitatively and quantitatively.
arXiv Detail & Related papers (2020-11-04T06:12:24Z)
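Classical Laplacian pyramid blending, which the fusion network above takes its inspiration from, merges band-pass levels of the two images under a progressively smoothed mask. A compact OpenCV sketch of the classical method (not the paper's network):

```python
# Classical Laplacian pyramid blending (Burt & Adelson), the technique the
# paper's fusion network is inspired by; not the network itself.
import cv2
import numpy as np

def laplacian_blend(fg, bg, mask, levels=5):
    """fg, bg: HxWx3 float32 in [0, 1]; mask: HxWx3 float32 in [0, 1].
    H and W should be divisible by 2**levels."""
    gf, gb, gm = [fg], [bg], [mask]
    for _ in range(levels):                     # Gaussian pyramids
        gf.append(cv2.pyrDown(gf[-1]))
        gb.append(cv2.pyrDown(gb[-1]))
        gm.append(cv2.pyrDown(gm[-1]))

    # Blend the coarsest Gaussian level, then add blended Laplacian bands.
    out = gm[-1] * gf[-1] + (1 - gm[-1]) * gb[-1]
    for i in range(levels - 1, -1, -1):         # coarse -> fine
        lap_f = gf[i] - cv2.pyrUp(gf[i + 1])    # band-pass detail of fg
        lap_b = gb[i] - cv2.pyrUp(gb[i + 1])    # band-pass detail of bg
        out = cv2.pyrUp(out) + gm[i] * lap_f + (1 - gm[i]) * lap_b
    return np.clip(out, 0.0, 1.0)

# Usage (all inputs HxWx3 float32, H and W divisible by 2**levels):
# out = laplacian_blend(fg / 255.0, bg / 255.0,
#                       np.repeat(mask[..., None], 3, axis=2))
```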