OmniPaint: Mastering Object-Oriented Editing via Disentangled Insertion-Removal Inpainting
- URL: http://arxiv.org/abs/2503.08677v2
- Date: Wed, 12 Mar 2025 17:05:47 GMT
- Title: OmniPaint: Mastering Object-Oriented Editing via Disentangled Insertion-Removal Inpainting
- Authors: Yongsheng Yu, Ziyun Zeng, Haitian Zheng, Jiebo Luo
- Abstract summary: We introduce OmniPaint, a unified framework that re-conceptualizes object removal and insertion as interdependent processes. Our novel CFD metric offers a robust, reference-free evaluation of context consistency and object hallucination.
- Score: 54.525583840585305
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Diffusion-based generative models have revolutionized object-oriented image editing, yet their deployment in realistic object removal and insertion remains hampered by challenges such as the intricate interplay of physical effects and insufficient paired training data. In this work, we introduce OmniPaint, a unified framework that re-conceptualizes object removal and insertion as interdependent processes rather than isolated tasks. Leveraging a pre-trained diffusion prior along with a progressive training pipeline comprising initial paired sample optimization and subsequent large-scale unpaired refinement via CycleFlow, OmniPaint achieves precise foreground elimination and seamless object insertion while faithfully preserving scene geometry and intrinsic properties. Furthermore, our novel CFD metric offers a robust, reference-free evaluation of context consistency and object hallucination, establishing a new benchmark for high-fidelity image editing. Project page: https://yeates.github.io/OmniPaint-Page/
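The abstract does not include code, but the mask-conditioned diffusion inpainting workflow that OmniPaint builds on can be sketched with an off-the-shelf pipeline. The snippet below is a minimal illustration using the public diffusers inpainting API; the checkpoint name, file paths, and prompt are assumptions for illustration and are not OmniPaint's released model or training code.

```python
# Minimal sketch of mask-conditioned diffusion inpainting (illustrative only).
# pip install torch diffusers transformers accelerate pillow
import torch
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline

# Public inpainting checkpoint used purely as a stand-in for a diffusion prior.
pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-inpainting",
    torch_dtype=torch.float16,
).to("cuda")

# The mask is white where the object should be removed (or a new one inserted).
image = Image.open("scene.png").convert("RGB").resize((512, 512))
mask = Image.open("object_mask.png").convert("L").resize((512, 512))

# For removal-style edits, describe the desired background rather than the object.
result = pipe(
    prompt="empty pavement, natural lighting",
    image=image,
    mask_image=mask,
    num_inference_steps=50,
    guidance_scale=7.5,
).images[0]
result.save("edited.png")
```

Per the abstract, OmniPaint's contribution is to couple removal and insertion during training (initial paired optimization followed by unpaired CycleFlow refinement) so that physical effects such as shadows and reflections are preserved; the generic pipeline above does none of that and serves only to show the basic image-plus-mask interface.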
Related papers
- FramePainter: Endowing Interactive Image Editing with Video Diffusion Priors [64.54220123913154]
We introduce FramePainter as an efficient instantiation of the image-to-video generation problem.
It only uses a lightweight sparse control encoder to inject editing signals.
It substantially outperforms previous state-of-the-art methods with far less training data.
arXiv Detail & Related papers (2025-01-14T16:09:16Z) - OmniEraser: Remove Objects and Their Effects in Images with Paired Video-Frame Data [21.469971783624402]
In this paper, we propose Video4Removal, a large-scale dataset comprising over 100,000 high-quality samples with realistic object shadows and reflections.
By constructing object-background pairs from video frames with off-the-shelf vision models, the labor costs of data acquisition can be significantly reduced.
To avoid generating shape-like artifacts and unintended content, we propose Object-Background Guidance.
We present OmniEraser, a novel method that seamlessly removes objects and their visual effects using only object masks as input.
arXiv Detail & Related papers (2025-01-13T15:12:40Z) - PixelMan: Consistent Object Editing with Diffusion Models via Pixel Manipulation and Generation [15.342060815068347]
PixelMan is an inversion-free and training-free method for achieving consistent object editing via Pixel Manipulation and Generation.
We show that in as few as 16 inference steps, PixelMan outperforms a range of state-of-the-art training-based and training-free methods.
arXiv Detail & Related papers (2024-12-18T19:24:15Z) - Generative Image Layer Decomposition with Visual Effects [49.75021036203426]
LayerDecomp is a generative framework for image layer decomposition.
It produces clean backgrounds and high-quality transparent foregrounds with faithfully preserved visual effects.
Our method achieves superior quality in layer decomposition, outperforming existing approaches in object removal and spatial editing tasks.
arXiv Detail & Related papers (2024-11-26T20:26:49Z) - InVi: Object Insertion In Videos Using Off-the-Shelf Diffusion Models [46.587906540660455]
We introduce InVi, an approach for inserting or replacing objects within videos using off-the-shelf, text-to-image latent diffusion models.
InVi achieves realistic object insertion with consistent blending and coherence across frames, outperforming existing methods.
arXiv Detail & Related papers (2024-07-15T17:55:09Z) - DiffUHaul: A Training-Free Method for Object Dragging in Images [78.93531472479202]
We propose a training-free method, dubbed DiffUHaul, for the object dragging task.
We first apply attention masking in each denoising step to make the generation more disentangled across different objects.
In the early denoising steps, we interpolate the attention features between source and target images to smoothly fuse new layouts with the original appearance (an illustrative sketch of this interpolation appears after the list of related papers).
arXiv Detail & Related papers (2024-06-03T17:59:53Z) - RefFusion: Reference Adapted Diffusion Models for 3D Scene Inpainting [63.567363455092234]
RefFusion is a novel 3D inpainting method based on a multi-scale personalization of an image inpainting diffusion model to the given reference view.
Our framework achieves state-of-the-art results for object removal while maintaining high controllability.
arXiv Detail & Related papers (2024-04-16T17:50:02Z) - ObjectStitch: Generative Object Compositing [43.206123360578665]
We propose a self-supervised framework for object compositing using conditional diffusion models.
Our framework can transform the viewpoint, geometry, color and shadow of the generated object while requiring no manual labeling.
Our method outperforms relevant baselines in both realism and faithfulness of the synthesized result images in a user study on various real-world images.
arXiv Detail & Related papers (2022-12-02T02:15:13Z) - High-Resolution Image Inpainting with Iterative Confidence Feedback and Guided Upsampling [122.06593036862611]
Existing image inpainting methods often produce artifacts when dealing with large holes in real applications.
We propose an iterative inpainting method with a feedback mechanism.
Experiments show that our method significantly outperforms existing methods in both quantitative and qualitative evaluations.
arXiv Detail & Related papers (2020-05-24T13:23:45Z)
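As noted in the DiffUHaul entry above, one way to drag an object while keeping its appearance is to interpolate attention features between the source and target images during the early denoising steps. The following is a minimal, self-contained sketch of such a linear blending schedule; the tensor shapes, step count, and decay schedule are assumptions for illustration, not DiffUHaul's actual implementation.

```python
import torch

def blend_attention_features(
    src_feat: torch.Tensor,
    tgt_feat: torch.Tensor,
    step: int,
    n_early_steps: int = 10,
) -> torch.Tensor:
    """Linearly interpolate source/target attention features during the early
    denoising steps, then keep only the target features.

    Illustrative sketch only; shapes and schedule are assumptions.
    """
    if step >= n_early_steps:
        return tgt_feat
    # Weight on the source appearance decays as denoising progresses.
    alpha = 1.0 - step / n_early_steps
    return alpha * src_feat + (1.0 - alpha) * tgt_feat

# Example: blending two dummy attention feature maps at step 3 of 10.
src = torch.randn(1, 8, 64, 64)
tgt = torch.randn(1, 8, 64, 64)
blended = blend_attention_features(src, tgt, step=3)
```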