Look here! A parametric learning based approach to redirect visual
attention
- URL: http://arxiv.org/abs/2008.05413v1
- Date: Wed, 12 Aug 2020 16:08:36 GMT
- Title: Look here! A parametric learning based approach to redirect visual
attention
- Authors: Youssef Alami Mejjati and Celso F. Gomez and Kwang In Kim and Eli
Shechtman and Zoya Bylinskii
- Abstract summary: We introduce an automatic method to make an image region more attention-capturing via subtle image edits.
Our model predicts a distinct set of global parametric transformations to be applied to the foreground and background image regions.
Our edits enable inference at interactive rates on any image size, and easily generalize to videos.
- Score: 49.609412873346386
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Across photography, marketing, and website design, being able to direct the
viewer's attention is a powerful tool. Motivated by professional workflows, we
introduce an automatic method to make an image region more attention-capturing
via subtle image edits that maintain realism and fidelity to the original. From
an input image and a user-provided mask, our GazeShiftNet model predicts a
distinct set of global parametric transformations to be applied to the
foreground and background image regions separately. We present the results of
quantitative and qualitative experiments that demonstrate improvements over
prior state-of-the-art. In contrast to existing attention shifting algorithms,
our global parametric approach better preserves image semantics and avoids
typical generative artifacts. Our edits enable inference at interactive rates
on any image size, and easily generalize to videos. Extensions of our model
allow for multi-style edits and the ability to both increase and attenuate
attention in an image region. Furthermore, users can customize the edited
images by dialing the edits up or down via interpolations in parameter space.
This paper presents a practical tool that can simplify future image editing
pipelines.
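The abstract describes predicting separate global parametric transformations for the foreground and background regions, compositing them via the user mask, and dialing edits up or down by interpolating in parameter space. A minimal sketch of that pipeline, using hypothetical stand-in parameters (exposure and saturation) in place of the learned transformations GazeShiftNet actually predicts:

```python
import numpy as np

def apply_parametric_edit(image, exposure=0.0, saturation=1.0):
    """Apply simple global parametric transforms (hypothetical stand-ins
    for the transformations the model would predict)."""
    out = image * (2.0 ** exposure)          # exposure as an EV-style gain
    gray = out.mean(axis=-1, keepdims=True)  # per-pixel luminance proxy
    out = gray + saturation * (out - gray)   # scale chroma around gray
    return np.clip(out, 0.0, 1.0)

def edit_with_mask(image, mask, fg_params, bg_params, strength=1.0):
    """Edit foreground and background separately, then composite by mask.
    `strength` interpolates each parameter set toward identity, mirroring
    the paper's dialing of edits up or down in parameter space."""
    identity = {"exposure": 0.0, "saturation": 1.0}
    def lerp(params):  # interpolate parameters, not pixels
        return {k: identity[k] + strength * (v - identity[k])
                for k, v in params.items()}
    fg = apply_parametric_edit(image, **lerp(fg_params))
    bg = apply_parametric_edit(image, **lerp(bg_params))
    m = mask[..., None].astype(image.dtype)
    return m * fg + (1.0 - m) * bg
```

Because the edits are global per region, this runs at interactive rates regardless of image size, and setting `strength=0.0` recovers the original image exactly.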
Related papers
- PartEdit: Fine-Grained Image Editing using Pre-Trained Diffusion Models [80.98455219375862]
We present the first text-based image editing approach for object parts based on pre-trained diffusion models.
Our approach is preferred by users 77-90% of the time in user studies.
arXiv Detail & Related papers (2025-02-06T13:08:43Z)
- PIXELS: Progressive Image Xemplar-based Editing with Latent Surgery [10.594261300488546]
We introduce a novel framework for progressive exemplar-driven editing with off-the-shelf diffusion models, dubbed PIXELS.
PIXELS provides granular control over edits, allowing adjustments at the pixel or region level.
We demonstrate that PIXELS delivers high-quality edits efficiently, leading to a notable improvement in quantitative metrics as well as human evaluation.
arXiv Detail & Related papers (2025-01-16T20:26:30Z)
- DiffUHaul: A Training-Free Method for Object Dragging in Images [78.93531472479202]
We propose a training-free method, dubbed DiffUHaul, for the object dragging task.
We first apply attention masking in each denoising step to make the generation more disentangled across different objects.
In the early denoising steps, we interpolate the attention features between source and target images to smoothly fuse new layouts with the original appearance.
arXiv Detail & Related papers (2024-06-03T17:59:53Z)
- Prompt Tuning Inversion for Text-Driven Image Editing Using Diffusion Models [6.34777393532937]
We propose an accurate and quick inversion technique, Prompt Tuning Inversion, for text-driven image editing.
Our proposed editing method consists of a reconstruction stage and an editing stage.
Experiments on ImageNet demonstrate the superior editing performance of our method compared to the state-of-the-art baselines.
arXiv Detail & Related papers (2023-05-08T03:34:33Z)
- Zero-shot Image-to-Image Translation [57.46189236379433]
We propose pix2pix-zero, an image-to-image translation method that can preserve the original image without manual prompting.
We propose cross-attention guidance, which aims to retain the cross-attention maps of the input image throughout the diffusion process.
Our method does not need additional training for these edits and can directly use the existing text-to-image diffusion model.
arXiv Detail & Related papers (2023-02-06T18:59:51Z)
- End-to-End Visual Editing with a Generatively Pre-Trained Artist [78.5922562526874]
We consider the targeted image editing problem: blending a region in a source image with a driver image that specifies the desired change.
We propose a self-supervised approach that simulates edits by augmenting off-the-shelf images in a target domain.
We show that different blending effects can be learned by an intuitive control of the augmentation process, with no other changes required to the model architecture.
arXiv Detail & Related papers (2022-05-03T17:59:30Z)
- Enjoy Your Editing: Controllable GANs for Image Editing via Latent Space Navigation [136.53288628437355]
Controllable semantic image editing enables a user to change entire image attributes with few clicks.
Current approaches often suffer from attribute edits that are entangled, global image identity changes, and diminished photo-realism.
We propose quantitative evaluation strategies for measuring controllable editing performance, unlike prior work which primarily focuses on qualitative evaluation.
arXiv Detail & Related papers (2021-02-01T21:38:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.