SliderEdit: Continuous Image Editing with Fine-Grained Instruction Control
- URL: http://arxiv.org/abs/2511.09715v1
- Date: Fri, 14 Nov 2025 01:05:52 GMT
- Title: SliderEdit: Continuous Image Editing with Fine-Grained Instruction Control
- Authors: Arman Zarei, Samyadeep Basu, Mobina Pournemat, Sayan Nag, Ryan Rossi, Soheil Feizi
- Abstract summary: We introduce SliderEdit, a framework for continuous image editing with fine-grained, interpretable instruction control. Given a multi-part edit instruction, SliderEdit disentangles the individual instructions and exposes each as a globally trained slider. Our results pave the way for interactive, instruction-driven image manipulation with continuous and compositional control.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Instruction-based image editing models have recently achieved impressive performance, enabling complex edits to an input image from a multi-instruction prompt. However, these models apply each instruction in the prompt with a fixed strength, limiting the user's ability to precisely and continuously control the intensity of individual edits. We introduce SliderEdit, a framework for continuous image editing with fine-grained, interpretable instruction control. Given a multi-part edit instruction, SliderEdit disentangles the individual instructions and exposes each as a globally trained slider, allowing smooth adjustment of its strength. Unlike prior works that introduced slider-based attribute controls in text-to-image generation, typically requiring separate training or fine-tuning for each attribute or concept, our method learns a single set of low-rank adaptation matrices that generalize across diverse edits, attributes, and compositional instructions. This enables continuous interpolation along individual edit dimensions while preserving both spatial locality and global semantic consistency. We apply SliderEdit to state-of-the-art image editing models, including FLUX-Kontext and Qwen-Image-Edit, and observe substantial improvements in edit controllability, visual consistency, and user steerability. To the best of our knowledge, we are the first to explore and propose a framework for continuous, fine-grained instruction control in instruction-based image editing models. Our results pave the way for interactive, instruction-driven image manipulation with continuous and compositional control.
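The core idea in the abstract, a single set of low-rank adaptation matrices whose contribution is continuously scaled by a slider value, can be illustrated with a minimal numpy sketch. All names (`W`, `A`, `B`, `edit_layer`) and dimensions are hypothetical placeholders, not the paper's actual architecture; the sketch only shows how a scalar slider linearly interpolates along an edit direction while slider = 0 recovers the base model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical frozen base weight and a single shared low-rank (LoRA) update.
d, rank = 8, 2
W = rng.standard_normal((d, d))           # frozen base layer weight
A = rng.standard_normal((rank, d)) * 0.1  # trained low-rank down-projection
B = rng.standard_normal((d, rank)) * 0.1  # trained low-rank up-projection

def edit_layer(x, slider):
    """slider = 0 recovers the base model; slider = 1 applies the full edit.
    Intermediate values interpolate continuously along the edit direction."""
    return x @ W.T + slider * (x @ A.T @ B.T)

x = rng.standard_normal((1, d))
base_out = edit_layer(x, 0.0)
half_out = edit_layer(x, 0.5)
full_out = edit_layer(x, 1.0)

# The output moves linearly along the edit direction as the slider increases.
assert np.allclose(half_out - base_out, 0.5 * (full_out - base_out))
```

Because the low-rank update enters the layer additively, scaling it by a scalar gives exact linear interpolation in the layer's output space; the paper's contribution is training one such update that generalizes across diverse edits rather than one per attribute.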
Related papers
- NumeriKontrol: Adding Numeric Control to Diffusion Transformers for Instruction-based Image Editing [12.728322570816248]
We introduce NumeriKontrol, a framework that allows users to adjust image attributes using continuous attribute values with common units. Thanks to its task-separated design, our approach supports zero-shot multi-condition editing. We synthesize precise training data from reliable sources, including high-fidelity DSLR cameras.
arXiv Detail & Related papers (2025-11-28T11:43:52Z) - Group Relative Attention Guidance for Image Editing [38.299491082179905]
Group Relative Attention Guidance (GRAG) is a simple yet effective method that modulates the focus of the model on the input image relative to the editing instruction. Our code will be released at https://www.littlemisfit.com/little-misfit/GRAG-Image-Editing.
arXiv Detail & Related papers (2025-10-28T17:22:44Z) - Kontinuous Kontext: Continuous Strength Control for Instruction-based Image Editing [76.44219733285898]
Kontinuous Kontext is an instruction-driven editing model that provides a new dimension of control over edit strength. A lightweight projector network maps the input scalar and the edit instruction to coefficients in the model's modulation space. For training our model, we synthesize a diverse dataset of image-edit-instruction-strength quadruplets using existing generative models.
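The projector described above can be sketched as a tiny MLP that concatenates the scalar strength with an instruction embedding and outputs modulation coefficients. This is an illustrative stand-in, not Kontinuous Kontext's actual network: the dimensions, weights (`W1`, `W2`), and the two-layer ReLU structure are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical dimensions for the instruction embedding, hidden layer,
# and the model's modulation space.
instr_dim, hidden, mod_dim = 8, 16, 4
W1 = rng.standard_normal((hidden, instr_dim + 1)) * 0.1
W2 = rng.standard_normal((mod_dim, hidden)) * 0.1

def projector(instr_emb, strength):
    """Concatenate the strength scalar with the instruction embedding,
    then project to modulation coefficients via a small ReLU MLP."""
    h = np.maximum(0.0, W1 @ np.concatenate([instr_emb, [strength]]))
    return W2 @ h

instr = rng.standard_normal(instr_dim)
coeffs_weak = projector(instr, 0.1)
coeffs_strong = projector(instr, 0.9)
assert coeffs_weak.shape == (mod_dim,)
```

The design point is that only the projector is conditioned on the strength scalar, so varying it sweeps the modulation coefficients smoothly without retraining the editing backbone.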
arXiv Detail & Related papers (2025-10-09T17:51:03Z) - SAEdit: Token-level control for continuous image editing via Sparse AutoEncoder [52.754326452329956]
We introduce a method for disentangled and continuous editing through token-level manipulation of text embeddings. The edits are applied by manipulating the embeddings along carefully chosen directions, which control the strength of the target attribute. Our method operates directly on text embeddings without modifying the diffusion process, making it model agnostic and broadly applicable to various image backbones.
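The embedding manipulation SAEdit describes reduces to moving a token embedding along an attribute direction scaled by a strength value. A minimal sketch, with a randomly chosen stand-in direction (the paper derives its directions via a sparse autoencoder, which is not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(1)

d = 16
token_embedding = rng.standard_normal(d)  # embedding of the token to edit
attribute_dir = rng.standard_normal(d)    # hypothetical attribute direction
attribute_dir /= np.linalg.norm(attribute_dir)

def edit_embedding(emb, direction, strength):
    """Move a token embedding along an attribute direction; the scalar
    strength continuously controls how much of the attribute is applied."""
    return emb + strength * direction

weak = edit_embedding(token_embedding, attribute_dir, 0.2)
strong = edit_embedding(token_embedding, attribute_dir, 1.0)

# Stronger edits displace the embedding further along the same direction.
assert np.linalg.norm(strong - token_embedding) > np.linalg.norm(weak - token_embedding)
```

Because only the text embedding is altered, the diffusion model itself is untouched, which is what makes the approach model agnostic.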
arXiv Detail & Related papers (2025-10-06T17:51:04Z) - PromptArtisan: Multi-instruction Image Editing in Single Pass with Complete Attention Control [1.0079049259808768]
PromptArtisan is a groundbreaking approach to multi-instruction image editing. It achieves remarkable results in a single pass, eliminating the need for time-consuming iterative refinement.
arXiv Detail & Related papers (2025-02-14T16:11:57Z) - UIP2P: Unsupervised Instruction-based Image Editing via Edit Reversibility Constraint [87.20985852686785]
We propose an unsupervised instruction-based image editing approach that removes the need for ground-truth edited images during training. Our approach addresses these challenges by introducing a novel editing mechanism called Edit Reversibility Constraint (ERC), which applies forward and reverse edits in one training step. This allows us to bypass the need for ground-truth edited images and unlock training for the first time on datasets comprising either real image-caption pairs or image-caption-instruction triplets.
arXiv Detail & Related papers (2024-12-19T18:59:58Z) - BrushEdit: All-In-One Image Inpainting and Editing [76.93556996538398]
BrushEdit is a novel inpainting-based instruction-guided image editing paradigm. We devise a system enabling free-form instruction editing by integrating MLLMs and a dual-branch image inpainting model. Our framework effectively combines MLLMs and inpainting models, achieving superior performance across seven metrics.
arXiv Detail & Related papers (2024-12-13T17:58:06Z) - AnyEdit: Mastering Unified High-Quality Image Editing for Any Idea [88.79769371584491]
We present AnyEdit, a comprehensive multi-modal instruction editing dataset. We ensure the diversity and quality of the AnyEdit collection through three aspects: initial data diversity, adaptive editing process, and automated selection of editing results. Experiments on three benchmark datasets show that AnyEdit consistently boosts the performance of diffusion-based editing models.
arXiv Detail & Related papers (2024-11-24T07:02:56Z) - Optimisation-Based Multi-Modal Semantic Image Editing [58.496064583110694]
We propose an inference-time editing optimisation to accommodate multiple editing instruction types.
By allowing the influence of each loss function to be adjusted, we build a flexible editing solution that can be tailored to user preferences.
We evaluate our method using text, pose and scribble edit conditions, and highlight our ability to achieve complex edits.
arXiv Detail & Related papers (2023-11-28T15:31:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.