Repositioning the Subject within Image
- URL: http://arxiv.org/abs/2401.16861v2
- Date: Sun, 17 Mar 2024 12:15:34 GMT
- Title: Repositioning the Subject within Image
- Authors: Yikai Wang, Chenjie Cao, Ke Fan, Qiaole Dong, Yifan Li, Xiangyang Xue, Yanwei Fu
- Abstract summary: We introduce an innovative dynamic manipulation task, subject repositioning.
This task involves relocating a user-specified subject to a desired position while preserving the image's fidelity.
Our research reveals that the fundamental sub-tasks of subject repositioning can be effectively reformulated as a unified, prompt-guided inpainting task.
- Score: 78.8467524191102
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Current image manipulation primarily centers on static manipulation, such as replacing specific regions within an image or altering its overall style. In this paper, we introduce an innovative dynamic manipulation task, subject repositioning. This task involves relocating a user-specified subject to a desired position while preserving the image's fidelity. Our research reveals that the fundamental sub-tasks of subject repositioning, which include filling the void left by the repositioned subject, reconstructing obscured portions of the subject and blending the subject to be consistent with surrounding areas, can be effectively reformulated as a unified, prompt-guided inpainting task. Consequently, we can employ a single diffusion generative model to address these sub-tasks using various task prompts learned through our proposed task inversion technique. Additionally, we integrate pre-processing and post-processing techniques to further enhance the quality of subject repositioning. These elements together form our SEgment-gEnerate-and-bLEnd (SEELE) framework. To assess SEELE's effectiveness in subject repositioning, we assemble a real-world subject repositioning dataset called ReS. Results of SEELE on ReS demonstrate its efficacy.
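The mask bookkeeping behind the sub-tasks named in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the SEELE implementation: the function name `reposition_subject` and the use of a plain pixel offset `(dy, dx)` are hypothetical, and the diffusion inpainting model that would fill the returned void region is not shown.

```python
import numpy as np

def reposition_subject(image, subject_mask, dy, dx):
    """Shift the masked subject by (dy, dx) pixels.

    Hypothetical helper for illustration only. Returns:
      out   - image with the subject pasted at its new location and the
              vacated pixels zeroed as a placeholder
      moved - boolean mask of the subject at its new location
      void  - boolean mask of pixels the subject left behind, i.e. the
              region an inpainting model would be asked to fill
    """
    h, w = subject_mask.shape
    ys, xs = np.nonzero(subject_mask)
    ny, nx = ys + dy, xs + dx
    # Drop subject pixels that would land outside the image.
    keep = (ny >= 0) & (ny < h) & (nx >= 0) & (nx < w)

    moved = np.zeros_like(subject_mask, dtype=bool)
    moved[ny[keep], nx[keep]] = True

    # The void is where the subject used to be and no longer is.
    void = subject_mask.astype(bool) & ~moved

    out = image.copy()
    # Read from the original image so overlapping source and target
    # regions do not overwrite each other.
    out[ny[keep], nx[keep]] = image[ys[keep], xs[keep]]
    out[void] = 0
    return out, moved, void


# Usage: move a two-pixel subject one column to the right.
image = np.arange(16).reshape(4, 4) + 10
mask = np.zeros((4, 4), dtype=bool)
mask[0, 0] = mask[0, 1] = True
out, moved, void = reposition_subject(image, mask, dy=0, dx=1)
```

In this sketch the `void` mask corresponds to the "filling the void" sub-task; reconstructing occluded parts of the subject and blending it into its surroundings would need further masks and the learned task prompts described in the abstract.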
Related papers
- FreeCompose: Generic Zero-Shot Image Composition with Diffusion Prior [50.0535198082903]
We offer a novel approach to image composition, which integrates multiple input images into a single, coherent image.
We showcase the potential of utilizing the powerful generative prior inherent in large-scale pre-trained diffusion models to accomplish generic image composition.
Date: 2024-07-06T03:35:43Z
- DiffUHaul: A Training-Free Method for Object Dragging in Images [78.93531472479202]
We propose a training-free method, dubbed DiffUHaul, for the object dragging task.
We first apply attention masking in each denoising step to make the generation more disentangled across different objects.
In the early denoising steps, we interpolate the attention features between source and target images to smoothly fuse new layouts with the original appearance.
Date: 2024-06-03T17:59:53Z
- Cones 2: Customizable Image Synthesis with Multiple Subjects [50.54010141032032]
We study how to efficiently represent a particular subject as well as how to appropriately compose different subjects.
By rectifying the activations in the cross-attention map, the layout appoints and separates the location of different subjects in the image.
Date: 2023-05-30T18:00:06Z
- Editing Out-of-domain GAN Inversion via Differential Activations [56.62964029959131]
We propose a novel GAN prior based editing framework to tackle the out-of-domain inversion problem with a composition-decomposition paradigm.
With the aid of the generated Diff-CAM mask, a coarse reconstruction can intuitively be composited by the paired original and edited images.
In the decomposition phase, we further present a GAN prior based deghosting network for separating the final fine edited image from the coarse reconstruction.
Date: 2022-07-17T10:34:58Z
- Situational Perception Guided Image Matting [16.1897179939677]
We propose a Situational Perception Guided Image Matting (SPG-IM) method that mitigates subjective bias of matting annotations.
SPG-IM can better associate inter-object and object-to-environment saliency, and compensate for the subjective nature of image matting.
Date: 2022-04-20T07:35:51Z
- Image Restoration using Feature-guidance [43.02281823557039]
We present a new approach suitable for handling the image-specific and spatially-varying nature of degradation in images.
We decompose the restoration task into two stages of degradation localization and degraded region-guided restoration.
We demonstrate that the model trained for this auxiliary task contains vital region knowledge, which can be exploited to guide the restoration network's training.
Date: 2022-01-01T13:10:19Z
- Adversarial Image Composition with Auxiliary Illumination [53.89445873577062]
We propose an Adversarial Image Composition Net (AIC-Net) that achieves realistic image composition.
A novel branched generation mechanism is proposed, which disentangles the generation of shadows and the transfer of foreground styles.
Experiments on pedestrian and car composition tasks show that the proposed AIC-Net achieves superior composition performance.
Date: 2020-09-17T12:58:16Z
This list is automatically generated from the titles and abstracts of the papers on this site.