From Inpainting to Layer Decomposition: Repurposing Generative Inpainting Models for Image Layer Decomposition
- URL: http://arxiv.org/abs/2511.20996v1
- Date: Wed, 26 Nov 2025 02:50:07 GMT
- Title: From Inpainting to Layer Decomposition: Repurposing Generative Inpainting Models for Image Layer Decomposition
- Authors: Jingxi Chen, Yixiao Zhang, Xiaoye Qian, Zongxia Li, Cornelia Fermuller, Caren Chen, Yiannis Aloimonos
- Abstract summary: A layered representation enables independent editing of elements, offering greater flexibility for content creation. We observe a strong connection between layer decomposition and in/outpainting tasks, and propose adapting a diffusion-based inpainting model for layer decomposition using lightweight finetuning. To further preserve detail in the latent space, we introduce a novel multi-modal context fusion module with linear attention complexity.
- Score: 16.7393689710179
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Images can be viewed as layered compositions: foreground objects over a background, with potential occlusions. This layered representation enables independent editing of elements, offering greater flexibility for content creation. Despite the progress in large generative models, decomposing a single image into layers remains challenging due to limited methods and data. We observe a strong connection between layer decomposition and in/outpainting tasks, and propose adapting a diffusion-based inpainting model for layer decomposition using lightweight finetuning. To further preserve detail in the latent space, we introduce a novel multi-modal context fusion module with linear attention complexity. Our model is trained purely on a synthetic dataset constructed from open-source assets and achieves superior performance in object removal and occlusion recovery, unlocking new possibilities in downstream editing and creative applications.
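The abstract ties two ideas together: an image is an alpha composite of a foreground layer over a background layer, so recovering the background behind an object is essentially inpainting the region the object occludes; and a multi-modal context fusion module with linear attention complexity helps preserve detail when fusing conditioning signals in the latent space. The listing gives no implementation details, so the two sketches below are only illustrations of these general patterns, not the authors' code; all function names, shapes, and design choices in them are assumptions.

First, a minimal sketch of how a synthetic training triple could be built from an open-source RGBA foreground asset and a background image, assuming both share the same resolution and using placeholder file paths:

```python
# Hypothetical synthetic-data construction: composite = alpha * F + (1 - alpha) * B.
# An inpainting-style model sees (composite, mask) and is supervised with the
# clean background B, i.e., the layer hidden behind the foreground object.
import numpy as np
from PIL import Image


def make_training_triplet(fg_rgba_path: str, bg_rgb_path: str):
    fg = np.asarray(Image.open(fg_rgba_path).convert("RGBA"), dtype=np.float32) / 255.0
    bg = np.asarray(Image.open(bg_rgb_path).convert("RGB"), dtype=np.float32) / 255.0
    alpha = fg[..., 3:4]                                   # (H, W, 1) opacity
    composite = alpha * fg[..., :3] + (1.0 - alpha) * bg   # layered composition
    mask = (alpha > 0).astype(np.float32)                  # region to recover
    return composite, mask, bg                             # model inputs and target
```

Second, a sketch of a context fusion block with linear attention complexity (PyTorch; the elu+1 feature map and the residual design are common choices from the linear-attention literature, not necessarily the paper's module): latent image tokens query a fixed-size summary of the context tokens, so the cost grows linearly in the token counts rather than quadratically.

```python
# Hypothetical multi-modal context fusion block with linear attention
# complexity. Module names, shapes, and the feature map are illustrative
# assumptions, not the paper's actual architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F


def feature_map(x: torch.Tensor) -> torch.Tensor:
    # Positive kernel feature map commonly used in linear attention.
    return F.elu(x) + 1.0


class LinearContextFusion(nn.Module):
    """Fuse latent image tokens with context tokens (e.g., mask or reference
    features) in time linear in the number of tokens."""

    def __init__(self, dim: int, heads: int = 8):
        super().__init__()
        assert dim % heads == 0
        self.heads, self.head_dim = heads, dim // heads
        self.to_q = nn.Linear(dim, dim, bias=False)
        self.to_k = nn.Linear(dim, dim, bias=False)
        self.to_v = nn.Linear(dim, dim, bias=False)
        self.proj = nn.Linear(dim, dim)

    def forward(self, latent_tokens: torch.Tensor, context_tokens: torch.Tensor):
        # latent_tokens: (B, N, dim); context_tokens: (B, M, dim)
        B, N, _ = latent_tokens.shape
        q = feature_map(self.to_q(latent_tokens))
        k = feature_map(self.to_k(context_tokens))
        v = self.to_v(context_tokens)

        # Split heads: (B, H, tokens, head_dim).
        q = q.view(B, N, self.heads, self.head_dim).transpose(1, 2)
        k = k.view(B, -1, self.heads, self.head_dim).transpose(1, 2)
        v = v.view(B, -1, self.heads, self.head_dim).transpose(1, 2)

        # Aggregate the context once (O(M)), then query the summary (O(N));
        # the N x M attention matrix is never materialized.
        kv = torch.einsum("bhmd,bhme->bhde", k, v)              # (B, H, d, d)
        norm = torch.einsum("bhnd,bhd->bhn", q, k.sum(dim=2))   # (B, H, N)
        out = torch.einsum("bhnd,bhde->bhne", q, kv) / (norm.unsqueeze(-1) + 1e-6)

        out = out.transpose(1, 2).reshape(B, N, -1)
        return latent_tokens + self.proj(out)                   # residual fusion


# Example: fuse 4096 latent tokens with 1024 context tokens.
if __name__ == "__main__":
    fusion = LinearContextFusion(dim=256)
    latents = torch.randn(2, 4096, 256)
    context = torch.randn(2, 1024, 256)
    print(fusion(latents, context).shape)  # torch.Size([2, 4096, 256])
```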
Related papers
- Cycle-Consistent Tuning for Layered Image Decomposition [26.331480224165364]
Disentangling visual layers in real-world images is a persistent challenge in vision and graphics. We present an in-context image decomposition framework that leverages large diffusion foundation models for layered separation. Our approach achieves accurate and coherent decompositions and also generalizes effectively across other decomposition types.
arXiv Detail & Related papers (2026-02-24T15:10:31Z)
- Qwen-Image-Layered: Towards Inherent Editability via Layer Decomposition [73.43121650616804]
We propose Qwen-Image-Layered, an end-to-end diffusion model that decomposes a single RGB image into multiple semantically disentangled RGBA layers. Our method significantly surpasses existing approaches in decomposition quality and establishes a new paradigm for consistent image editing.
arXiv Detail & Related papers (2025-12-17T17:12:42Z)
- LayerComposer: Interactive Personalized T2I via Spatially-Aware Layered Canvas [47.5187068545227]
We present LayerComposer, an interactive framework for personalized, multi-subject text-to-image generation. The proposed layered canvas allows users to place, resize, or lock input subjects through intuitive layer manipulation. Our locking mechanism requires no architectural changes, relying instead on inherent positional embeddings combined with a new complementary data sampling strategy.
arXiv Detail & Related papers (2025-10-23T17:59:55Z)
- LayerD: Decomposing Raster Graphic Designs into Layers [15.294433619347082]
LayerD is a method to decompose graphic designs into layers for a re-editable creative workflow. We propose a simple yet effective refinement approach that takes advantage of the assumption that layers often exhibit uniform appearance. In experiments, we show that LayerD successfully achieves high-quality decomposition and outperforms baselines.
arXiv Detail & Related papers (2025-09-29T17:50:12Z)
- DiffDecompose: Layer-Wise Decomposition of Alpha-Composited Images via Diffusion Transformers [85.1185656296496]
We present DiffDecompose, a diffusion Transformer-based framework that learns the posterior over possible layer decompositions conditioned on the input image. The code and dataset will be available upon paper acceptance.
arXiv Detail & Related papers (2025-05-24T16:08:04Z)
- DreamLayer: Simultaneous Multi-Layer Generation via Diffusion Model [47.32061459437175]
We introduce DreamLayer, a framework that enables coherent text-driven generation of multiple image layers. By explicitly modeling the relationship between transparent foreground and background layers, DreamLayer builds inter-layer connections. Experiments and user studies demonstrate that DreamLayer generates more coherent and well-aligned layers.
arXiv Detail & Related papers (2025-03-17T05:34:11Z)
- LayeringDiff: Layered Image Synthesis via Generation, then Disassembly with Generative Knowledge [14.481577976493236]
LayeringDiff is a novel pipeline for the synthesis of layered images. By extracting layers from a composite image, rather than generating them from scratch, LayeringDiff bypasses the need for large-scale training. For effective layer decomposition, we adapt a large-scale pretrained generative prior to estimate foreground and background layers.
arXiv Detail & Related papers (2025-01-02T11:18:25Z)
- Generative Image Layer Decomposition with Visual Effects [49.75021036203426]
LayerDecomp is a generative framework for image layer decomposition. It produces clean backgrounds and high-quality transparent foregrounds with faithfully preserved visual effects. Our method achieves superior quality in layer decomposition, outperforming existing approaches in object removal and spatial editing tasks.
arXiv Detail & Related papers (2024-11-26T20:26:49Z)
- LayerDiff: Exploring Text-guided Multi-layered Composable Image Synthesis via Layer-Collaborative Diffusion Model [70.14953942532621]
Layer-collaborative diffusion model, named LayerDiff, is designed for text-guided, multi-layered, composable image synthesis.
Our model can generate high-quality multi-layered images with performance comparable to conventional whole-image generation methods.
LayerDiff enables a broader range of controllable generative applications, including layer-specific image editing and style transfer.
arXiv Detail & Related papers (2024-03-18T16:28:28Z)
This list is automatically generated from the titles and abstracts of the papers on this site.