Related papers: VectorEdits: A Dataset and Benchmark for Instruction-Based Editing of Vector Graphics

URL: http://arxiv.org/abs/2506.15903v1
Date: Wed, 18 Jun 2025 22:17:30 GMT
Title: VectorEdits: A Dataset and Benchmark for Instruction-Based Editing of Vector Graphics
Authors: Josef Kuchař, Marek Kadlčík, Michal Spiegel, Michal Štefánik,
Abstract summary: This dataset consists of over 270,000 pairs of SVG images paired with natural language edit instructions. We describe the data collection process, including image pairing via CLIP similarity and instruction generation with vision-language models.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We introduce a large-scale dataset for instruction-guided vector image editing, consisting of over 270,000 pairs of SVG images paired with natural language edit instructions. Our dataset enables training and evaluation of models that modify vector graphics based on textual commands. We describe the data collection process, including image pairing via CLIP similarity and instruction generation with vision-language models. Initial experiments with state-of-the-art large language models reveal that current methods struggle to produce accurate and valid edits, underscoring the challenge of this task. To foster research in natural language-driven vector graphic generation and editing, we make our resources created within this work publicly available.
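The abstract mentions pairing images via CLIP similarity without giving details here. As a rough illustration only, pairing by embedding similarity can be sketched as below, assuming each image has already been embedded as a vector. The function names, the toy vectors, and the `low`/`high` thresholds are hypothetical and are not taken from the paper; the upper bound is an assumed way to skip near-identical images.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat zero vectors as dissimilar to everything
    return dot / (norm_a * norm_b)


def pair_by_similarity(embeddings, low=0.8, high=0.999):
    """Return (i, j, similarity) for every i < j whose similarity
    lies in [low, high]. Both thresholds are hypothetical values,
    not numbers reported by the paper."""
    pairs = []
    for i in range(len(embeddings)):
        for j in range(i + 1, len(embeddings)):
            sim = cosine_similarity(embeddings[i], embeddings[j])
            if low <= sim <= high:
                pairs.append((i, j, sim))
    return pairs


# Toy 2-D vectors standing in for image embeddings (assumption:
# real embeddings would come from a CLIP image encoder).
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
pairs = pair_by_similarity(toy_embeddings)
```

With these toy vectors only the first two are similar enough to be paired. The all-pairs loop is quadratic, so a collection of the size described would likely need an approximate nearest-neighbour index instead.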

Related papers

Beyond Editing Pairs: Fine-Grained Instructional Image Editing via Multi-Scale Learnable Regions [20.617718631292696]
We develop a novel paradigm for instruction-driven image editing that leverages widely available and enormous text-image pairs. Our approach introduces a multi-scale learnable region to localize and guide the editing process. By treating the alignment between images and their textual descriptions as supervision and learning to generate task-specific editing regions, our method achieves high-fidelity, precise, and instruction-consistent image editing.
arXiv Detail & Related papers (2025-05-25T22:40:59Z)
Image Inpainting Models are Effective Tools for Instruction-guided Image Editing [42.63350374074953]
This technical report describes the winning solution of the CVPR2024 GenAI Media Generation Challenge Workshop's Instruction-guided Image Editing track. We use a 4-step process, IIIE (Inpainting-based Instruction-guided Image Editing): editing category classification, main editing object identification, editing mask acquisition, and image inpainting. Results show that through proper combinations of language models and image inpainting models, our pipeline can reach a high success rate with satisfying visual quality.
arXiv Detail & Related papers (2024-07-18T03:55:33Z)
SEED-Data-Edit Technical Report: A Hybrid Dataset for Instructional Image Editing [53.00272278754867]
SEED-Data-Edit is a hybrid dataset for instruction-guided image editing. It combines three kinds of data: high-quality editing data produced by an automated pipeline, real-world scenario data collected from the internet, and high-precision multi-turn editing data annotated by humans.
arXiv Detail & Related papers (2024-05-07T04:55:47Z)
InstructCV: Instruction-Tuned Text-to-Image Diffusion Models as Vision Generalists [66.85125112199898]
We develop a unified language interface for computer vision tasks that abstracts away task-specific design choices. Our model, dubbed InstructCV, performs competitively compared to other generalist and task-specific vision models.
arXiv Detail & Related papers (2023-09-30T14:26:43Z)
iEdit: Localised Text-guided Image Editing with Weak Supervision [53.082196061014734]
We propose a novel learning method for text-guided image editing. It generates images conditioned on a source image and a textual edit prompt. It shows favourable results against its counterparts in terms of image fidelity, CLIP alignment score and qualitatively for editing both generated and real images.
arXiv Detail & Related papers (2023-05-10T07:39:14Z)
InstructPix2Pix: Learning to Follow Image Editing Instructions [103.77092910685764]
We propose a method for editing images from human instructions. Given an input image and a written instruction that tells the model what to do, our model follows these instructions to edit the image. We show compelling editing results for a diverse collection of input images and written instructions.
arXiv Detail & Related papers (2022-11-17T18:58:43Z)
Learning by Planning: Language-Guided Global Image Editing [53.72807421111136]
We develop a text-to-operation model to map the vague editing language request into a series of editing operations. The only supervision in the task is the target image, which is insufficient for a stable training of sequential decisions. We propose a novel operation planning algorithm to generate possible editing sequences from the target image as pseudo ground truth.
arXiv Detail & Related papers (2021-06-24T16:30:03Z)
Graph Edit Distance Reward: Learning to Edit Scene Graph [69.39048809061714]
We propose a new method to edit the scene graph according to user instructions, a task that has not been explored before. Specifically, in order to learn to edit scene graphs according to the semantics given by texts, we propose a Graph Edit Distance Reward. In the context of text-editing image retrieval, we validate the effectiveness of our method on the CSS and CRIR datasets.
arXiv Detail & Related papers (2020-08-15T04:52:16Z)

This list is automatically generated from the titles and abstracts of the papers on this site.