Dual In-painting Model for Unsupervised Gaze Correction and Animation in
the Wild
- URL: http://arxiv.org/abs/2008.03834v1
- Date: Sun, 9 Aug 2020 23:14:16 GMT
- Title: Dual In-painting Model for Unsupervised Gaze Correction and Animation in
the Wild
- Authors: Jichao Zhang, Jingjing Chen, Hao Tang, Wei Wang, Yan Yan, Enver
Sangineto, Nicu Sebe
- Abstract summary: We present a solution that works without the need for precise annotations of the gaze angle and the head pose.
Our method consists of three novel modules: the Gaze Correction module (GCM), the Gaze Animation module (GAM), and the Pretrained Autoencoder module (PAM).
- Score: 82.42401132933462
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper we address the problem of unsupervised gaze correction in the
wild, presenting a solution that works without the need for precise annotations
of the gaze angle and the head pose. We have created a new dataset called
CelebAGaze, which consists of two domains, $X$ and $Y$, in which the eyes are
staring at the camera or somewhere else, respectively. Our method consists of
three novel modules: the Gaze Correction module (GCM), the Gaze Animation
module (GAM), and the Pretrained Autoencoder module (PAM). Specifically, GCM
and GAM separately train a dual in-painting network, using data from domain
$X$ for gaze correction and data from domain $Y$ for gaze animation.
Additionally, a Synthesis-As-Training method is proposed for training GAM,
encouraging the features encoded from the eye region to be correlated with the
angle information, so that gaze animation can be achieved by interpolation in
the latent space. To further preserve identity information (e.g., eye shape,
iris color), we propose PAM, an autoencoder based on self-supervised mirror
learning in which the bottleneck features are angle-invariant and serve as an
extra input to the dual in-painting models. Extensive experiments validate the effectiveness
of the proposed method for gaze correction and gaze animation in the wild and
demonstrate the superiority of our approach in producing more compelling
results than state-of-the-art baselines. Our code, the pretrained models and
the supplementary material are available at:
https://github.com/zhangqianhui/GazeAnimation.
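To make the animation step in the abstract concrete, the sketch below shows gaze animation as linear interpolation between two eye-region latent codes that are then decoded back into the eye area. It is a minimal, self-contained PyTorch illustration only: the module names (EyeEncoder, InpaintingDecoder), tensor shapes, and network depths are assumptions made for the example, not the authors' released implementation (see the GitHub link above for the actual code).

```python
# Minimal sketch (NOT the authors' code): gaze animation by interpolating
# eye-region latent codes and re-painting the eye area from each code.
import torch
import torch.nn as nn

class EyeEncoder(nn.Module):
    """Hypothetical encoder for a cropped eye region (3x64x64)."""
    def __init__(self, z_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 4, 2, 1), nn.ReLU(),    # 64 -> 32
            nn.Conv2d(32, 64, 4, 2, 1), nn.ReLU(),   # 32 -> 16
            nn.Conv2d(64, 128, 4, 2, 1), nn.ReLU(),  # 16 -> 8
            nn.Flatten(),
            nn.Linear(128 * 8 * 8, z_dim),
        )
    def forward(self, x):
        return self.net(x)

class InpaintingDecoder(nn.Module):
    """Hypothetical decoder that paints a new eye region from a latent code."""
    def __init__(self, z_dim=128):
        super().__init__()
        self.fc = nn.Linear(z_dim, 128 * 8 * 8)
        self.net = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.ReLU(),  # 8 -> 16
            nn.ConvTranspose2d(64, 32, 4, 2, 1), nn.ReLU(),   # 16 -> 32
            nn.ConvTranspose2d(32, 3, 4, 2, 1), nn.Tanh(),    # 32 -> 64
        )
    def forward(self, z):
        h = self.fc(z).view(-1, 128, 8, 8)
        return self.net(h)

def animate_gaze(eye_a, eye_b, encoder, decoder, steps=5):
    """Interpolate between two eye crops in latent space (gaze animation)."""
    z_a, z_b = encoder(eye_a), encoder(eye_b)
    frames = []
    for t in torch.linspace(0.0, 1.0, steps):
        z_t = (1.0 - t) * z_a + t * z_b   # linear interpolation in latent space
        frames.append(decoder(z_t))        # re-painted eye region for this step
    return frames

if __name__ == "__main__":
    enc, dec = EyeEncoder(), InpaintingDecoder()
    a = torch.randn(1, 3, 64, 64)  # e.g. an eye crop staring at the camera
    b = torch.randn(1, 3, 64, 64)  # e.g. an eye crop looking elsewhere
    frames = animate_gaze(a, b, enc, dec)
    print([f.shape for f in frames])  # 5 frames of shape (1, 3, 64, 64)
```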
Related papers
- Merging Multiple Datasets for Improved Appearance-Based Gaze Estimation [10.682719521609743]
The Two-stage Transformer-based Gaze-feature Fusion (TTGF) method uses transformers to merge information from each eye and the face separately, and then merges across the two eyes.
Our proposed Gaze Adaptation Module (GAM) handles annotation inconsistency by applying a separate adaptation module for each dataset to correct the gaze estimates from a single shared estimator, as sketched below.
arXiv Detail & Related papers (2024-09-02T02:51:40Z)
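As a rough illustration of the per-dataset correction idea in the summary above, the sketch below applies a small learnable affine adjustment on top of a shared gaze estimator's output, with one adjustment module per training dataset. The class names, shapes, and the affine form of the correction are assumptions made for the example, not that paper's implementation.

```python
# Generic sketch: one shared gaze estimator plus a small per-dataset
# correction module that absorbs annotation inconsistencies.
import torch
import torch.nn as nn

class SharedGazeEstimator(nn.Module):
    """Hypothetical shared backbone predicting (yaw, pitch) from an image."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 64),
                                 nn.ReLU(), nn.Linear(64, 2))
    def forward(self, x):
        return self.net(x)

class PerDatasetAdapter(nn.Module):
    """Learnable affine correction applied to the shared gaze estimate."""
    def __init__(self):
        super().__init__()
        self.scale = nn.Parameter(torch.ones(2))
        self.shift = nn.Parameter(torch.zeros(2))
    def forward(self, gaze):
        return gaze * self.scale + self.shift

estimator = SharedGazeEstimator()
adapters = nn.ModuleDict({"datasetA": PerDatasetAdapter(),
                          "datasetB": PerDatasetAdapter()})

batch = torch.randn(4, 3, 32, 32)               # images from "datasetA"
corrected = adapters["datasetA"](estimator(batch))
print(corrected.shape)                           # (4, 2): corrected (yaw, pitch)
```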
- Pix2Gif: Motion-Guided Diffusion for GIF Generation [70.64240654310754]
We present Pix2Gif, a motion-guided diffusion model for image-to-GIF (video) generation.
We propose a new motion-guided warping module to spatially transform the features of the source image conditioned on the two types of prompts.
In preparation for the model training, we meticulously curated data by extracting coherent image frames from the TGIF video-caption dataset.
arXiv Detail & Related papers (2024-03-07T16:18:28Z)
- The Change You Want to See (Now in 3D) [65.61789642291636]
The goal of this paper is to detect what has changed, if anything, between two "in the wild" images of the same 3D scene.
We contribute a change detection model that is trained entirely on synthetic data and is class-agnostic.
We release a new evaluation dataset consisting of real-world image pairs with human-annotated differences.
arXiv Detail & Related papers (2023-08-21T01:59:45Z)
- Spatial Steerability of GANs via Self-Supervision from Discriminator [123.27117057804732]
We propose a self-supervised approach to improve the spatial steerability of GANs without searching for steerable directions in the latent space.
Specifically, we design randomly sampled Gaussian heatmaps to be encoded into the intermediate layers of generative models as spatial inductive bias.
During inference, users can interact with the spatial heatmaps in an intuitive manner, enabling them to edit the output image by adjusting the scene layout, moving, or removing objects.
arXiv Detail & Related papers (2023-01-20T07:36:29Z)
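As a rough illustration of the heatmap mechanism described in the summary above, the snippet below samples a random Gaussian heatmap and concatenates it onto an intermediate feature map as an extra channel. It is a generic sketch under assumed shapes, not that paper's architecture; the function gaussian_heatmap and all sizes are made up for the example.

```python
# Generic sketch: inject a randomly sampled Gaussian heatmap into an
# intermediate feature map as a spatial inductive bias (illustrative only).
import torch

def gaussian_heatmap(h, w, sigma=4.0):
    """Sample one Gaussian blob at a random location on an h x w grid."""
    cy, cx = torch.rand(2) * torch.tensor([h - 1.0, w - 1.0])
    ys = torch.arange(h).float().view(h, 1)
    xs = torch.arange(w).float().view(1, w)
    return torch.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2 * sigma ** 2))

features = torch.randn(1, 256, 16, 16)            # intermediate generator features
heat = gaussian_heatmap(16, 16).view(1, 1, 16, 16)
conditioned = torch.cat([features, heat], dim=1)  # (1, 257, 16, 16)
print(conditioned.shape)
```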
- Unsupervised High-Resolution Portrait Gaze Correction and Animation [81.19271523855554]
This paper proposes a gaze correction and animation method for high-resolution, unconstrained portrait images.
We first create two new portrait datasets: CelebGaze and high-resolution CelebHQGaze.
We formulate the gaze correction task as an image inpainting problem, addressed using a Gaze Correction Module and a Gaze Animation Module.
arXiv Detail & Related papers (2022-07-01T08:14:42Z)
- L2CS-Net: Fine-Grained Gaze Estimation in Unconstrained Environments [2.5234156040689237]
We propose a robust CNN-based model for predicting gaze in unconstrained settings.
We use two identical losses, one for each angle, to improve network learning and increase its generalization.
Our proposed model achieves state-of-the-art accuracy (mean angular error) of 3.92° and 10.41° on the MPIIGaze and Gaze360 datasets, respectively.
arXiv Detail & Related papers (2022-03-07T12:35:39Z)
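The two-loss idea in the summary above can be illustrated with a generic sketch: separate prediction heads and separate loss terms for the yaw and pitch angles, summed into one training objective. The backbone, head sizes, and the plain MSE loss are assumptions made for the example (L2CS-Net itself combines classification and regression losses), not that paper's code.

```python
# Generic sketch: separate loss terms for the yaw and pitch gaze angles.
import torch
import torch.nn as nn

backbone = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 64), nn.ReLU())
yaw_head = nn.Linear(64, 1)
pitch_head = nn.Linear(64, 1)
criterion = nn.MSELoss()  # placeholder loss for the sketch

images = torch.randn(8, 3, 32, 32)
target = torch.randn(8, 2)                # ground-truth (yaw, pitch) in radians
feat = backbone(images)
loss = criterion(yaw_head(feat).squeeze(1), target[:, 0]) \
     + criterion(pitch_head(feat).squeeze(1), target[:, 1])
loss.backward()
print(float(loss))
```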
- Adversarial Bipartite Graph Learning for Video Domain Adaptation [50.68420708387015]
Domain adaptation techniques, which focus on adapting models between distributionally different domains, are rarely explored in the video recognition area.
Recent works on visual domain adaptation that leverage adversarial learning to unify the source and target video representations are not highly effective on videos.
This paper proposes an Adversarial Bipartite Graph (ABG) learning framework which directly models the source-target interactions.
arXiv Detail & Related papers (2020-07-31T03:48:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.