Controllable Continuous Gaze Redirection
- URL: http://arxiv.org/abs/2010.04513v1
- Date: Fri, 9 Oct 2020 11:50:06 GMT
- Title: Controllable Continuous Gaze Redirection
- Authors: Weihao Xia, Yujiu Yang, Jing-Hao Xue, Wensen Feng
- Abstract summary: We present interpGaze, a novel framework for controllable gaze redirection.
Our goal is to redirect the eye gaze of one person into any gaze direction depicted in the reference image.
The proposed interpGaze outperforms state-of-the-art methods in terms of image quality and redirection precision.
- Score: 47.15883248953411
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work, we present interpGaze, a novel framework for controllable gaze
redirection that achieves both precise redirection and continuous
interpolation. Given two gaze images with different attributes, our goal is to
redirect the eye gaze of one person into any gaze direction depicted in the
reference image or to generate continuous intermediate results. To accomplish
this, we design a model including three cooperative components: an encoder, a
controller and a decoder. The encoder maps images into a well-disentangled and
hierarchically-organized latent space. The controller adjusts the magnitudes of
latent vectors to the desired strength of corresponding attributes by altering
a control vector. The decoder converts the desired representations from the
attribute space to the image space. To facilitate covering the full space of
gaze directions, we introduce a high-quality gaze image dataset with a large
range of directions, which also benefits researchers in related areas.
Extensive experimental validation and comparisons to several baseline methods
show that the proposed interpGaze outperforms state-of-the-art methods in terms
of image quality and redirection precision.
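The controller described above reduces to a simple operation: scale the gap between a source latent and a reference latent by a control vector. A minimal PyTorch sketch of that pipeline follows; the module shapes, the linear encoder/decoder, and the name `control` are illustrative assumptions, not the paper's actual architecture.

```python
# Minimal sketch of the encoder-controller-decoder pipeline (PyTorch).
# Shapes, linear layers, and names are illustrative assumptions only.
import torch
import torch.nn as nn

LATENT = 64                      # assumed size of the attribute latent
IMG = 3 * 32 * 64                # assumed 3x32x64 eye crops

class Encoder(nn.Module):
    """Maps an eye image into the disentangled attribute latent space."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Flatten(), nn.Linear(IMG, LATENT))

    def forward(self, x):
        return self.net(x)

class Decoder(nn.Module):
    """Maps an attribute latent back to image space."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(LATENT, IMG),
                                 nn.Unflatten(1, (3, 32, 64)))

    def forward(self, v):
        return self.net(v)

def control(v_src, v_ref, alpha):
    """Controller: scale the source-to-reference gap by control vector alpha.

    alpha = 0 keeps the source gaze, alpha = 1 reproduces the reference
    gaze, and intermediate values give continuous interpolation; alpha
    may also be a per-attribute vector rather than a scalar.
    """
    return v_src + alpha * (v_ref - v_src)

enc, dec = Encoder(), Decoder()
x_src = torch.randn(1, 3, 32, 64)    # person whose gaze is redirected
x_ref = torch.randn(1, 3, 32, 64)    # reference image with the target gaze
for a in (0.0, 0.5, 1.0):            # redirection strengths
    y = dec(control(enc(x_src), enc(x_ref), a))   # redirected eye image
```

Sweeping the control value from 0 to 1 is exactly the continuous interpolation between the two gaze images that the abstract describes.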
Related papers
- TraDiffusion: Trajectory-Based Training-Free Image Generation [85.39724878576584]
We propose a training-free, trajectory-based controllable T2I approach, termed TraDiffusion.
This novel method allows users to effortlessly guide image generation via mouse trajectories.
arXiv Detail & Related papers (2024-08-19T07:01:43Z) - Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold [79.94300820221996]
DragGAN is a new way of controlling generative adversarial networks (GANs).
It lets users deform an image with precise control over where pixels go, manipulating the pose, shape, expression, and layout of diverse categories such as animals, cars, humans, and landscapes.
Both qualitative and quantitative comparisons demonstrate the advantage of DragGAN over prior approaches in the tasks of image manipulation and point tracking.
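As a rough intuition for point-based manipulation, the toy loop below optimizes a latent code so that the feature found at a handle point is reproduced at a target point. This is a heavily simplified stand-in, not DragGAN's actual motion-supervision and point-tracking procedure, and `G`, `feat`, and the coordinates are assumed inputs.

```python
# Toy stand-in for point-based latent optimization. This is NOT DragGAN's
# motion-supervision/point-tracking procedure; G (latent -> image) and
# feat (image -> C x H x W feature map) are assumed to be given.
import torch
import torch.nn.functional as F

def drag(G, feat, w, handle, target, steps=50, lr=1e-2):
    """Optimize latent w so the feature at `handle` is reproduced at `target`."""
    w = w.detach().clone().requires_grad_(True)
    opt = torch.optim.Adam([w], lr=lr)
    (hx, hy), (tx, ty) = handle, target
    for _ in range(steps):
        f = feat(G(w))                             # (C, H, W) features
        # pull the content currently at the handle toward the target pixel
        loss = F.mse_loss(f[:, ty, tx], f[:, hy, hx].detach())
        opt.zero_grad()
        loss.backward()
        opt.step()
    return w.detach()
```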
arXiv Detail & Related papers (2023-05-18T13:41:25Z) - Multi-Projection Fusion and Refinement Network for Salient Object
Detection in 360{\deg} Omnidirectional Image [141.10227079090419]
We propose a Multi-Projection Fusion and Refinement Network (MPFR-Net) to detect salient objects in 360° omnidirectional images.
MPFR-Net takes the equirectangular projection image and four corresponding cube-unfolding images as inputs (see the sketch after this entry).
Experimental results on two omnidirectional datasets demonstrate that the proposed approach outperforms the state-of-the-art methods both qualitatively and quantitatively.
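A minimal sketch of that multi-projection input scheme: one branch encodes the equirectangular image, a shared branch encodes each cube face, and the features are fused into a saliency map. The channel counts, the shared-weight choice, and the assumption that faces are resized to the equirectangular resolution are all illustrative; the real MPFR-Net uses far richer fusion and refinement modules.

```python
# Illustrative fusion of the equirectangular image with four cube faces.
import torch
import torch.nn as nn

class MultiProjectionFusion(nn.Module):
    def __init__(self, ch=16):
        super().__init__()
        self.equi_branch = nn.Conv2d(3, ch, 3, padding=1)
        self.cube_branch = nn.Conv2d(3, ch, 3, padding=1)   # shared over faces
        self.fuse = nn.Conv2d(ch * 5, 1, 1)                 # 1 equi + 4 faces

    def forward(self, equi, faces):
        feats = [self.equi_branch(equi)]
        feats += [self.cube_branch(f) for f in faces]       # faces resized to equi
        return torch.sigmoid(self.fuse(torch.cat(feats, dim=1)))  # saliency map

net = MultiProjectionFusion()
equi = torch.randn(1, 3, 64, 128)
faces = [torch.randn(1, 3, 64, 128) for _ in range(4)]
saliency = net(equi, faces)    # -> (1, 1, 64, 128)
```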
arXiv Detail & Related papers (2022-12-23T14:50:40Z) - CUDA-GR: Controllable Unsupervised Domain Adaptation for Gaze
Redirection [3.0141238193080295]
Gaze redirection aims to manipulate the gaze in an image toward a desired direction.
Advances in generative adversarial networks have yielded photo-realistic images, but such fine-grained control normally requires ground-truth annotations for the training data, which can be very expensive to obtain.
arXiv Detail & Related papers (2021-06-21T04:39:42Z) - Unsupervised Discovery of Disentangled Manifolds in GANs [74.24771216154105]
An interpretable generation process benefits various image editing applications.
We propose a framework to discover interpretable directions in the latent space given arbitrary pre-trained generative adversarial networks.
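Once such a direction is discovered, editing is just a shift in latent space. A toy illustration, with the generator `G`, latent `z`, and direction `d` assumed given:

```python
# Toy edit with a discovered latent direction d; G, z, d assumed given.
import torch

def edit(G, z, d, alpha):
    """Generate from z shifted by alpha along the (normalized) direction d."""
    return G(z + alpha * d / d.norm())

# sweeping alpha traverses the attribute the direction encodes, e.g.
# imgs = [edit(G, z, d, a) for a in torch.linspace(-3.0, 3.0, steps=7)]
```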
arXiv Detail & Related papers (2020-11-24T02:18:08Z) - Boosting Image-based Mutual Gaze Detection using Pseudo 3D Gaze [19.10872208787867]
Mutual gaze detection plays an important role in understanding human interactions.
We propose a simple and effective approach to boost the performance by using an auxiliary 3D gaze estimation task during the training phase.
We achieve the performance boost without additional labeling cost by training the 3D gaze estimation branch using pseudo 3D gaze labels deduced from mutual gaze labels.
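The auxiliary-task idea amounts to adding a regression loss on the pseudo 3D gaze labels next to the mutual-gaze classification loss. A hedged sketch, where `lambda_aux` and the specific loss choices are assumptions rather than the paper's exact formulation:

```python
# Sketch of the auxiliary-task training signal: the mutual-gaze classifier
# is trained jointly with 3D gaze regression on pseudo labels. The loss
# choices and lambda_aux are assumptions, not the paper's exact setup.
import torch
import torch.nn.functional as F

def joint_loss(mutual_logit, mutual_label, gaze_pred, pseudo_gaze, lambda_aux=0.1):
    cls = F.binary_cross_entropy_with_logits(mutual_logit, mutual_label.float())
    aux = F.mse_loss(gaze_pred, pseudo_gaze)   # pseudo 3D gaze supervision
    return cls + lambda_aux * aux
```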
arXiv Detail & Related papers (2020-10-15T15:01:41Z) - Coarse-to-Fine Gaze Redirection with Numerical and Pictorial Guidance [74.27389895574422]
We propose a novel gaze redirection framework that exploits both numerical and pictorial direction guidance.
The proposed method outperforms the state-of-the-art approaches in terms of both image quality and redirection precision.
arXiv Detail & Related papers (2020-04-07T01:17:27Z)