Related papers: RePainter: Empowering E-commerce Object Removal via Spatial-matting Reinforcement Learning

URL: http://arxiv.org/abs/2510.07721v1
Date: Thu, 09 Oct 2025 02:57:33 GMT
Title: RePainter: Empowering E-commerce Object Removal via Spatial-matting Reinforcement Learning
Authors: Zipeng Guo, Lichen Ma, Xiaolong Fu, Gaojing Zhou, Lan Yang, Yuchen Zhou, Linkai Liu, Yu He, Ximan Liu, Shiping Dong, Jingling Fu, Zhen Chen, Yu Shi, Junshi Huang, Jason Li, Chao Gou
Abstract summary: Repainter is a reinforcement learning framework that integrates spatial-matting trajectory refinement with Group Relative Policy Optimization. Our approach modulates attention mechanisms to emphasize background context, generating higher-reward samples and reducing unwanted object insertion. We contribute EcomPaint-100K, a high-quality, large-scale e-commerce inpainting dataset, and a standardized benchmark EcomPaint-Bench for fair evaluation.
Score: 26.053034031708254
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In web data, product images are central to boosting user engagement and advertising efficacy on e-commerce platforms, yet the intrusive elements such as watermarks and promotional text remain major obstacles to delivering clear and appealing product visuals. Although diffusion-based inpainting methods have advanced, they still face challenges in commercial settings due to unreliable object removal and limited domain-specific adaptation. To tackle these challenges, we propose Repainter, a reinforcement learning framework that integrates spatial-matting trajectory refinement with Group Relative Policy Optimization (GRPO). Our approach modulates attention mechanisms to emphasize background context, generating higher-reward samples and reducing unwanted object insertion. We also introduce a composite reward mechanism that balances global, local, and semantic constraints, effectively reducing visual artifacts and reward hacking. Additionally, we contribute EcomPaint-100K, a high-quality, large-scale e-commerce inpainting dataset, and a standardized benchmark EcomPaint-Bench for fair evaluation. Extensive experiments demonstrate that Repainter significantly outperforms state-of-the-art methods, especially in challenging scenes with intricate compositions. We will release our code and weights upon acceptance.

Related papers

AUVIC: Adversarial Unlearning of Visual Concepts for Multi-modal Large Language Models [63.05306474002547]
Regulatory frameworks mandating the 'right to be forgotten' drive the need for machine unlearning. We introduce AUVIC, a novel visual concept unlearning framework for MLLMs. We show that AUVIC achieves state-of-the-art target forgetting rates while incurring minimal performance degradation on non-target concepts.
arXiv Detail & Related papers (2025-11-14T13:35:32Z)
DreamPainter: Image Background Inpainting for E-commerce Scenarios [9.12444106077783]
We introduce DreamPainter, a novel framework that incorporates text prompts for control and reference image information as an additional control signal. Our approach significantly outperforms state-of-the-art methods, maintaining high product consistency while effectively integrating both text prompt and reference image information.
arXiv Detail & Related papers (2025-08-04T07:54:37Z)
ObjectClear: Complete Object Removal via Object-Effect Attention [56.2893552300215]
We introduce a new dataset for OBject-Effect Removal, named OBER, which provides paired images with and without object effects, along with precise masks for both objects and their associated visual artifacts. We propose a novel framework, ObjectClear, which incorporates an object-effect attention mechanism to guide the model toward the foreground removal regions by learning attention masks. Experiments demonstrate that ObjectClear outperforms existing methods, achieving improved object-effect removal quality and background fidelity, especially in complex scenarios.
arXiv Detail & Related papers (2025-05-28T17:51:17Z)
OmniPaint: Mastering Object-Oriented Editing via Disentangled Insertion-Removal Inpainting [54.525583840585305]
We introduce OmniPaint, a unified framework that re-conceptualizes object removal and insertion as interdependent processes. Our novel CFD metric offers a robust, reference-free evaluation of context consistency and object hallucination.
arXiv Detail & Related papers (2025-03-11T17:55:27Z)
OmniEraser: Remove Objects and Their Effects in Images with Paired Video-Frame Data [21.469971783624402]
In this paper, we propose Video4Removal, a large-scale dataset comprising over 100,000 high-quality samples with realistic object shadows and reflections. By constructing object-background pairs from video frames with off-the-shelf vision models, the labor costs of data acquisition can be significantly reduced. To avoid generating shape-like artifacts and unintended content, we propose Object-Background Guidance. We present OmniEraser, a novel method that seamlessly removes objects and their visual effects using only object masks as input.
arXiv Detail & Related papers (2025-01-13T15:12:40Z)
E-Commerce Inpainting with Mask Guidance in Controlnet for Reducing Overcompletion [13.67619785783182]
This paper systematically analyzes and addresses a core pain point in diffusion model generation: overcompletion. Our method has achieved promising results in practical applications and we hope it can serve as an inspiring technical report in this field.
arXiv Detail & Related papers (2024-09-15T10:10:13Z)
CLIPAway: Harmonizing Focused Embeddings for Removing Objects via Diffusion Models [16.58831310165623]
CLIPAway is a novel approach leveraging CLIP embeddings to focus on background regions while excluding foreground elements. It enhances inpainting accuracy and quality by identifying embeddings that prioritize the background. Unlike other methods that rely on specialized training datasets or costly manual annotations, CLIPAway provides a flexible, plug-and-play solution.
arXiv Detail & Related papers (2024-06-13T17:50:28Z)
A Straightforward Gradient-Based Approach for High-Tc Superconductor Design: Leveraging Domain Knowledge via Adaptive Constraints [0.0]
Material design aims to discover novel compounds with desired properties. Conventional element-substitution approaches readily incorporate various domain knowledge but remain confined to a narrow search space. Deep generative models efficiently explore vast compositional landscapes, yet they struggle to flexibly integrate domain knowledge. We propose a gradient-based material design framework that combines these strengths, offering both efficiency and adaptability.
arXiv Detail & Related papers (2024-03-20T14:23:17Z)
ZoomNeXt: A Unified Collaborative Pyramid Network for Camouflaged Object Detection [70.11264880907652]
Recent camouflaged object detection (COD) attempts to segment objects visually blended into their surroundings, which is extremely complex and difficult in real-world scenarios. We propose an effective unified collaborative pyramid network that mimics human behavior when observing vague images, namely zooming in and out. Our framework consistently outperforms existing state-of-the-art methods in image and video COD benchmarks.
arXiv Detail & Related papers (2023-10-31T06:11:23Z)
PAIF: Perception-Aware Infrared-Visible Image Fusion for Attack-Tolerant Semantic Segmentation [50.556961575275345]
We propose a perception-aware fusion framework to promote segmentation robustness in adversarial scenes. We show that our scheme substantially enhances the robustness, with gains of 15.3% mIOU, compared with advanced competitors.
arXiv Detail & Related papers (2023-08-08T01:55:44Z)
Searching a Compact Architecture for Robust Multi-Exposure Image Fusion [55.37210629454589]
Two major stumbling blocks hinder the development of multi-exposure image fusion: pixel misalignment and inefficient inference. This study introduces an architecture search-based paradigm incorporating self-alignment and detail repletion modules for robust multi-exposure image fusion. The proposed method outperforms various competitive schemes, achieving a noteworthy 3.19% improvement in PSNR for general scenarios and an impressive 23.5% enhancement in misaligned scenarios.
arXiv Detail & Related papers (2023-05-20T17:01:52Z)

This list is automatically generated from the titles and abstracts of the papers on this site.