Related papers: LayerPeeler: Autoregressive Peeling for Layer-wise Image Vectorization

URL: http://arxiv.org/abs/2505.23740v1
Date: Thu, 29 May 2025 17:58:03 GMT
Title: LayerPeeler: Autoregressive Peeling for Layer-wise Image Vectorization
Authors: Ronghuan Wu, Wanchao Su, Jing Liao
Abstract summary: We introduce LayerPeeler, a novel layer-wise image vectorization approach. By identifying and removing the topmost non-occluded layers, we generate vector graphics with complete paths and coherent layer structures. Our method leverages vision-language models to construct a layer graph that captures relationships among elements.
Score: 14.917583676464266
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Image vectorization is a powerful technique that converts raster images into vector graphics, enabling enhanced flexibility and interactivity. However, popular image vectorization tools struggle with occluded regions, producing incomplete or fragmented shapes that hinder editability. While recent advancements have explored rule-based and data-driven layer-wise image vectorization, these methods face limitations in vectorization quality and flexibility. In this paper, we introduce LayerPeeler, a novel layer-wise image vectorization approach that addresses these challenges through a progressive simplification paradigm. The key to LayerPeeler's success lies in its autoregressive peeling strategy: by identifying and removing the topmost non-occluded layers while recovering underlying content, we generate vector graphics with complete paths and coherent layer structures. Our method leverages vision-language models to construct a layer graph that captures occlusion relationships among elements, enabling precise detection and description for non-occluded layers. These descriptive captions are used as editing instructions for a finetuned image diffusion model to remove the identified layers. To ensure accurate removal, we employ localized attention control that precisely guides the model to target regions while faithfully preserving the surrounding content. To support this, we contribute a large-scale dataset specifically designed for layer peeling tasks. Extensive quantitative and qualitative experiments demonstrate that LayerPeeler significantly outperforms existing techniques, producing vectorization results with superior path semantics, geometric regularity, and visual fidelity.
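The peeling order implied by the abstract can be illustrated with a small sketch. It covers only the graph side of the idea: given occlusion relationships, repeatedly pick the layers that nothing else covers and remove them. In the paper the layer graph comes from a vision-language model and the removal is done by a finetuned diffusion model; neither is reproduced here. The function name `peel_order`, the dictionary representation, and the toy graph are assumptions made for illustration, not the authors' implementation.

```python
from typing import Dict, List, Set


def peel_order(occludes: Dict[str, Set[str]]) -> List[List[str]]:
    """Group layers by peeling step, topmost (non-occluded) layers first.

    `occludes[a]` is the set of layers that layer `a` partially covers.
    This hand-written mapping is a hypothetical stand-in for a layer graph.
    """
    # Collect every layer mentioned as a key or as a covered layer.
    remaining = set(occludes)
    for covered in occludes.values():
        remaining |= covered

    steps: List[List[str]] = []
    while remaining:
        # A layer is non-occluded if no remaining layer covers it.
        covered_now: Set[str] = set()
        for top in remaining:
            covered_now |= occludes.get(top, set()) & remaining
        free = sorted(remaining - covered_now)
        if not free:
            raise ValueError("occlusion graph contains a cycle")
        steps.append(free)
        remaining -= set(free)
    return steps


# Toy example (assumed data): two details on a face, over a background.
toy_graph = {
    "eye": {"face"},
    "hat": {"face"},
    "face": {"background"},
    "background": set(),
}
steps = peel_order(toy_graph)
print(steps)
```

Reversing `steps` gives a bottom-to-top stacking order, which is the order in which the peeled layers would be composited back into a layered vector graphic.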

Related papers

MagicQuillV2: Precise and Interactive Image Editing with Layered Visual Cues [106.02577891104079]
We propose MagicQuill V2, a novel system that introduces a layered composition paradigm to generative image editing. Our method deconstructs creative intent into a stack of controllable visual cues.
arXiv Detail & Related papers (2025-12-02T18:59:58Z)
Illustrator's Depth: Monocular Layer Index Prediction for Image Decomposition [55.8308608221966]
We introduce Illustrator's Depth, a novel definition of depth that addresses a key challenge in digital content creation: decomposing flat images into editable, ordered layers. Inspired by an artist's compositional process, illustrator's depth infers a layer index to each pixel, forming an interpretable image decomposition.
arXiv Detail & Related papers (2025-11-21T17:56:43Z)
LayerD: Decomposing Raster Graphic Designs into Layers [15.294433619347082]
LayerD is a method to decompose graphic designs into layers for re-editable creative workflow. We propose a simple yet effective refinement approach taking advantage of the assumption that layers often exhibit uniform appearance. In experiments, we show that LayerD successfully achieves high-quality decomposition and outperforms baselines.
arXiv Detail & Related papers (2025-09-29T17:50:12Z)
ART: Anonymous Region Transformer for Variable Multi-Layer Transparent Image Generation [108.69315278353932]
We introduce the Anonymous Region Transformer (ART), which facilitates the direct generation of variable multi-layer transparent images. By enabling precise control and scalable layer generation, ART establishes a new paradigm for interactive content creation.
arXiv Detail & Related papers (2025-02-25T16:57:04Z)
Generative Image Layer Decomposition with Visual Effects [49.75021036203426]
LayerDecomp is a generative framework for image layer decomposition. It produces clean backgrounds and high-quality transparent foregrounds with faithfully preserved visual effects. Our method achieves superior quality in layer decomposition, outperforming existing approaches in object removal and spatial editing tasks.
arXiv Detail & Related papers (2024-11-26T20:26:49Z)
Stable Flow: Vital Layers for Training-Free Image Editing [74.52248787189302]
Diffusion models have revolutionized the field of content synthesis and editing. Recent models have replaced the traditional UNet architecture with the Diffusion Transformer (DiT). We propose an automatic method to identify "vital layers" within DiT, crucial for image formation. Next, to enable real-image editing, we introduce an improved image inversion method for flow models.
arXiv Detail & Related papers (2024-11-21T18:59:51Z)
DeepIcon: A Hierarchical Network for Layer-wise Icon Vectorization [12.82009632507056]
Recent learning-based methods for converting images to vector formats frequently suffer from incomplete shapes, redundant path prediction, and a lack of accuracy in preserving the semantics of the original content. We present DeepIcon, a novel hierarchical image vectorization network specifically tailored to generating variable-length icon graphics based on the image input.
arXiv Detail & Related papers (2024-10-21T08:20:19Z)
Segmentation-guided Layer-wise Image Vectorization with Gradient Fills [6.037332707968933]
We propose a segmentation-guided vectorization framework to convert images into concise vector graphics with gradient fills. With the guidance of an embedded gradient-aware segmentation, our approach progressively appends gradient-filled Bézier paths to the output.
arXiv Detail & Related papers (2024-08-28T12:08:25Z)
Layered Image Vectorization via Semantic Simplification [45.55066618943338]
This work presents a progressive image vectorization technique that reconstructs the image as layer-wise vectors from semantic-aligned macro structures to finer details. Our approach introduces a new image simplification method leveraging the feature-average effect in the Score Distillation Sampling mechanism. The resulting vectors are layered and well-aligned with the target image's explicit and implicit semantic structures.
arXiv Detail & Related papers (2024-06-08T08:54:35Z)
LayerDiff: Exploring Text-guided Multi-layered Composable Image Synthesis via Layer-Collaborative Diffusion Model [70.14953942532621]
A layer-collaborative diffusion model, named LayerDiff, is designed for text-guided, multi-layered, composable image synthesis. Our model can generate high-quality multi-layered images with performance comparable to conventional whole-image generation methods. LayerDiff enables a broader range of controllable generative applications, including layer-specific image editing and style transfer.
arXiv Detail & Related papers (2024-03-18T16:28:28Z)
Parallax-Tolerant Unsupervised Deep Image Stitching [57.76737888499145]
We propose UDIS++, a parallax-tolerant unsupervised deep image stitching technique. First, we propose a robust and flexible warp to model the image registration from global homography to local thin-plate spline motion. To further eliminate the parallax artifacts, we propose to composite the stitched image seamlessly by unsupervised learning for seam-driven composition masks.
arXiv Detail & Related papers (2023-02-16T10:40:55Z)
Unsupervised Structure-Consistent Image-to-Image Translation [6.282068591820945]
The Swapping Autoencoder achieved state-of-the-art performance in deep image manipulation and image-to-image translation. We improve this work by introducing a simple yet effective auxiliary module based on gradient reversal layers. The auxiliary module's loss forces the generator to learn to reconstruct an image with an all-zero texture code.
arXiv Detail & Related papers (2022-08-24T13:47:15Z)
Learning to See Through Obstructions with Layered Decomposition [117.77024641706451]
We present a learning-based approach for removing unwanted obstructions from moving images. Our method leverages motion differences between the background and obstructing elements to recover both layers. We show that the proposed approach, learned from synthetically generated data, performs well on real images.
arXiv Detail & Related papers (2020-08-11T17:59:31Z)

This list is automatically generated from the titles and abstracts of the papers on this site.