A Diffusion-Based Framework for Occluded Object Movement
- URL: http://arxiv.org/abs/2504.01873v1
- Date: Wed, 02 Apr 2025 16:29:30 GMT
- Title: A Diffusion-Based Framework for Occluded Object Movement
- Authors: Zheng-Peng Duan, Jiawei Zhang, Siyu Liu, Zheng Lin, Chun-Le Guo, Dongqing Zou, Jimmy Ren, Chongyi Li
- Abstract summary: We propose a Diffusion-based framework specifically designed for Occluded Object Movement, named DiffOOM. The de-occlusion branch utilizes a background color-fill strategy and a continuously updated object mask to focus the diffusion process on completing the obscured portion of the target object. Concurrently, the movement branch employs latent optimization to place the completed object in the target location and adopts local text-conditioned guidance to integrate the object into new surroundings appropriately.
- Score: 39.6345172890042
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Seamlessly moving objects within a scene is a common requirement for image editing, but it is still a challenge for existing editing methods. Especially for real-world images, the occlusion situation further increases the difficulty. The main difficulty is that the occluded portion needs to be completed before movement can proceed. To leverage the real-world knowledge embedded in the pre-trained diffusion models, we propose a Diffusion-based framework specifically designed for Occluded Object Movement, named DiffOOM. The proposed DiffOOM consists of two parallel branches that perform object de-occlusion and movement simultaneously. The de-occlusion branch utilizes a background color-fill strategy and a continuously updated object mask to focus the diffusion process on completing the obscured portion of the target object. Concurrently, the movement branch employs latent optimization to place the completed object in the target location and adopts local text-conditioned guidance to integrate the object into new surroundings appropriately. Extensive evaluations demonstrate the superior performance of our method, which is further validated by a comprehensive user study.
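The two-branch design described in the abstract can be illustrated schematically. The sketch below is a toy mock-up, not the authors' implementation: `denoise_step`, `update_object_mask`, and `move_latent` are hypothetical stand-ins for the diffusion model, the mask-refinement step, and the latent-optimization placement, respectively, and the latent is a small numpy array rather than a real diffusion latent.

```python
import numpy as np

rng = np.random.default_rng(0)

def denoise_step(latent, mask):
    # Stand-in for one diffusion denoising step confined to the object
    # mask; a real model would predict and subtract noise here.
    update = rng.normal(scale=0.1, size=latent.shape)
    return latent - update * mask

def update_object_mask(latent, threshold=0.5):
    # Hypothetical mask refinement: re-estimate which latent positions
    # belong to the object as de-occlusion progresses.
    return (np.abs(latent) > threshold).astype(float)

def move_latent(latent, shift):
    # Latent optimization is approximated by a simple spatial shift of
    # the latent toward the target location.
    return np.roll(latent, shift, axis=(0, 1))

# Toy 8x8 single-channel latent and an all-ones initial object mask.
latent = rng.normal(size=(8, 8))
mask = np.ones((8, 8))

for _ in range(10):
    # De-occlusion branch: diffusion focused on completing the object.
    latent = denoise_step(latent, mask)
    mask = update_object_mask(latent)
    # Movement branch: place the (partially) completed object.
    latent = move_latent(latent, shift=(1, 0))
```

The point of the sketch is only the control flow: de-occlusion and movement run in the same loop, with the mask continuously updated so later denoising steps stay focused on the object.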
Related papers
- OmniPaint: Mastering Object-Oriented Editing via Disentangled Insertion-Removal Inpainting [54.525583840585305]
We introduce OmniPaint, a unified framework that re-conceptualizes object removal and insertion as interdependent processes. Our novel CFD metric offers a robust, reference-free evaluation of context consistency and object hallucination.
arXiv Detail & Related papers (2025-03-11T17:55:27Z) - Erase Diffusion: Empowering Object Removal Through Calibrating Diffusion Pathways [13.08168394252538]
Erase inpainting aims to precisely remove target objects within masked regions while preserving the overall consistency of the surrounding content. We propose a novel Erase Diffusion, termed EraDiff, aimed at unleashing the potential power of standard diffusion in the context of object removal. Our proposed EraDiff achieves state-of-the-art performance on the OpenImages V5 dataset and demonstrates significant superiority in real-world scenarios.
arXiv Detail & Related papers (2025-03-10T08:06:51Z) - Affordance-Aware Object Insertion via Mask-Aware Dual Diffusion [29.770096013143117]
We extend the concept of Affordance from human-centered image composition tasks to a more general object-scene composition framework. We propose a Mask-Aware Dual Diffusion (MADD) model, which utilizes a dual-stream architecture to simultaneously denoise the RGB image and the insertion mask. Our method outperforms the state-of-the-art methods and exhibits strong generalization performance on in-the-wild images.
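The dual-stream idea in the MADD summary can be sketched as follows. This is a minimal mock-up under loose assumptions: `dual_stream_step` is a hypothetical stand-in for one denoising step in which the mask stream is denoised alongside the RGB stream and the current soft mask gates where the RGB latent may change.

```python
import numpy as np

rng = np.random.default_rng(1)

def dual_stream_step(rgb_latent, mask_latent):
    # Stand-in for one mask-aware dual-diffusion step: both streams are
    # updated together, and the current mask estimate gates where the
    # inserted object may alter the RGB latent.
    mask_latent = mask_latent - 0.1 * rng.normal(size=mask_latent.shape)
    gate = 1.0 / (1.0 + np.exp(-mask_latent))  # soft mask in [0, 1]
    rgb_latent = rgb_latent - 0.1 * gate * rng.normal(size=rgb_latent.shape)
    return rgb_latent, mask_latent

rgb = rng.normal(size=(8, 8, 3))    # toy RGB latent
mask = rng.normal(size=(8, 8, 1))   # toy insertion-mask latent
for _ in range(5):
    rgb, mask = dual_stream_step(rgb, mask)
```

The design choice the sketch captures is that the mask is not a fixed input: it is itself denoised, so mask and image estimates refine each other over the diffusion trajectory.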
arXiv Detail & Related papers (2024-12-19T02:23:13Z) - DiffUHaul: A Training-Free Method for Object Dragging in Images [78.93531472479202]
We propose a training-free method, dubbed DiffUHaul, for the object dragging task.
We first apply attention masking in each denoising step to make the generation more disentangled across different objects.
In the early denoising steps, we interpolate the attention features between source and target images to smoothly fuse new layouts with the original appearance.
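The interpolation of attention features between source and target during early denoising steps can be sketched as below. This is a schematic illustration, not DiffUHaul's actual code: the blending schedule (`early_frac`, linear alpha) is an assumption, and the attention features are stand-in numpy arrays.

```python
import numpy as np

def interpolate_attention(src_feat, tgt_feat, step, total_steps,
                          early_frac=0.3):
    # In early denoising steps, blend source and target attention
    # features, ramping from pure source toward target; later steps use
    # the target features alone so the new layout dominates.
    early_steps = early_frac * total_steps
    if step < early_steps:
        alpha = step / early_steps  # 0 -> source, 1 -> target
        return (1 - alpha) * src_feat + alpha * tgt_feat
    return tgt_feat

src = np.zeros((4, 4))  # toy source attention features
tgt = np.ones((4, 4))   # toy target attention features
blended = interpolate_attention(src, tgt, step=0, total_steps=50)
late = interpolate_attention(src, tgt, step=40, total_steps=50)
```

At step 0 the output equals the source features, preserving the original appearance; past the early phase the target features are used unchanged, committing to the new layout.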
arXiv Detail & Related papers (2024-06-03T17:59:53Z) - The Background Also Matters: Background-Aware Motion-Guided Objects Discovery [2.6442319761949875]
We propose a Background-aware Motion-guided Objects Discovery method.
We leverage masks of moving objects extracted from optical flow and design a learning mechanism to extend them to the true foreground.
This enables a joint learning of the objects discovery task and the object/non-object separation.
arXiv Detail & Related papers (2023-11-05T12:35:47Z) - ZoomNeXt: A Unified Collaborative Pyramid Network for Camouflaged Object Detection [70.11264880907652]
Recent camouflaged object detection (COD) attempts to segment objects visually blended into their surroundings, which is extremely complex and difficult in real-world scenarios.
We propose an effective unified collaborative pyramid network that mimics human behavior when observing vague images and searching for camouflaged objects, i.e., zooming in and out.
Our framework consistently outperforms existing state-of-the-art methods in image and video COD benchmarks.
arXiv Detail & Related papers (2023-10-31T06:11:23Z) - Diffusion Model for Camouflaged Object Detection [2.592600158870236]
We propose a diffusion-based framework for camouflaged object detection, termed diffCOD.
The proposed method achieves favorable performance compared to 11 existing state-of-the-art methods.
arXiv Detail & Related papers (2023-08-01T05:50:33Z) - DragDiffusion: Harnessing Diffusion Models for Interactive Point-based Image Editing [94.24479528298252]
DragGAN is an interactive point-based image editing framework that achieves impressive editing results with pixel-level precision.
By harnessing large-scale pretrained diffusion models, we greatly enhance the applicability of interactive point-based editing on both real and diffusion-generated images.
We present a challenging benchmark dataset called DragBench to evaluate the performance of interactive point-based image editing methods.
arXiv Detail & Related papers (2023-06-26T06:04:09Z) - Discovering Objects that Can Move [55.743225595012966]
We study the problem of object discovery -- separating objects from the background without manual labels.
Existing approaches utilize appearance cues, such as color, texture, and location, to group pixels into object-like regions.
We choose to focus on dynamic objects -- entities that can move independently in the world.
arXiv Detail & Related papers (2022-03-18T21:13:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.