A Coarse-to-Fine Adaptive Network for Appearance-Based Gaze Estimation
- URL: http://arxiv.org/abs/2001.00187v1
- Date: Wed, 1 Jan 2020 10:39:03 GMT
- Title: A Coarse-to-Fine Adaptive Network for Appearance-Based Gaze Estimation
- Authors: Yihua Cheng, Shiyao Huang, Fei Wang, Chen Qian, Feng Lu
- Abstract summary: We propose a coarse-to-fine strategy which estimates a basic gaze direction from the face image and refines it with a corresponding residual predicted from eye images.
We construct a coarse-to-fine adaptive network named CA-Net and achieve state-of-the-art performance on MPIIGaze and EyeDiap.
- Score: 24.8796573846653
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Human gaze is essential for various appealing applications. Aiming at more
accurate gaze estimation, a series of recent works propose to utilize face and
eye images simultaneously. Nevertheless, face and eye images only serve as
independent or parallel feature sources in those works; the intrinsic
correlation between their features is overlooked. In this paper we make the
following contributions: 1) We propose a coarse-to-fine strategy which
estimates a basic gaze direction from the face image and refines it with a
corresponding residual predicted from the eye images. 2) Guided by the proposed
strategy, we design a framework which introduces a bi-gram model to bridge gaze
residual and basic gaze direction, and an attention component to adaptively
acquire suitable fine-grained features. 3) Integrating the above innovations, we
construct a coarse-to-fine adaptive network named CA-Net and achieve
state-of-the-art performance on the MPIIGaze and EyeDiap datasets.
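Below is a minimal PyTorch sketch of the coarse-to-fine idea from the abstract: a face branch predicts a basic gaze direction and an eye branch predicts a residual correction. The backbones, layer sizes, and the way the residual head is conditioned on the basic gaze (a simple stand-in for the paper's bi-gram bridge and attention component) are illustrative assumptions, not CA-Net's exact design.

```python
import torch
import torch.nn as nn

class CoarseToFineGaze(nn.Module):
    """Toy coarse-to-fine gaze estimator: face -> basic gaze, eyes -> residual."""

    def __init__(self, feat_dim: int = 128):
        super().__init__()

        def backbone() -> nn.Sequential:
            # Placeholder CNN encoder; CA-Net's actual backbones differ.
            return nn.Sequential(
                nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                nn.Linear(64, feat_dim), nn.ReLU(),
            )

        self.face_cnn = backbone()                 # coarse (face) branch
        self.basic_head = nn.Linear(feat_dim, 2)   # (yaw, pitch)
        self.eye_cnn = backbone()                  # fine branch, shared by both eyes
        # Conditioning the residual on the basic gaze stands in for the
        # paper's bi-gram bridge between the two predictions.
        self.residual_head = nn.Linear(2 * feat_dim + 2, 2)

    def forward(self, face, left_eye, right_eye):
        basic_gaze = self.basic_head(self.face_cnn(face))      # coarse estimate
        eye_feat = torch.cat([self.eye_cnn(left_eye),
                              self.eye_cnn(right_eye)], dim=1)
        residual = self.residual_head(
            torch.cat([eye_feat, basic_gaze], dim=1))          # fine correction
        return basic_gaze + residual                           # refined gaze

model = CoarseToFineGaze()
face, eye = torch.randn(4, 3, 96, 96), torch.randn(4, 3, 36, 60)
print(model(face, eye, eye).shape)  # torch.Size([4, 2]) -> (yaw, pitch)
```

Training would presumably supervise the summed output (and possibly the basic gaze as well) with ground-truth gaze directions.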
Related papers
- Dual-Image Enhanced CLIP for Zero-Shot Anomaly Detection [58.228940066769596]
We introduce a Dual-Image Enhanced CLIP approach, leveraging a joint vision-language scoring system.
Our method processes pairs of images, using each as a visual reference for the other, thereby enriching the inference process with visual context.
Our approach exploits the potential of joint vision-language anomaly detection and performs comparably to current SOTA methods across various datasets.
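A rough sketch of the paired-image, joint vision-language scoring idea, assuming OpenAI's `clip` package; the prompt wording and the cross-image reference term are inferences from the summary, not the paper's exact formulation.

```python
import torch
import clip  # pip install git+https://github.com/openai/CLIP
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)
# Assumed prompts; the paper's prompt design is not given in the summary.
text = clip.tokenize(["a photo of a normal object",
                      "a photo of a damaged object"]).to(device)

def anomaly_score(img_a: Image.Image, img_b: Image.Image) -> float:
    """Score img_a for anomalies, using img_b as its visual reference."""
    with torch.no_grad():
        feats = model.encode_image(torch.stack(
            [preprocess(img_a), preprocess(img_b)]).to(device))
        feats = feats / feats.norm(dim=-1, keepdim=True)
        text_feats = model.encode_text(text)
        text_feats = text_feats / text_feats.norm(dim=-1, keepdim=True)
        # Language term: probability mass on the anomalous prompt.
        lang_term = (100.0 * feats @ text_feats.T).softmax(dim=-1)[0, 1].item()
        # Visual term: dissimilarity to the reference image's embedding.
        vis_term = 1.0 - (feats[0] @ feats[1]).item()
    return lang_term + vis_term
```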
arXiv Detail & Related papers (2024-05-08T03:13:20Z)
- GazeFusion: Saliency-guided Image Generation [50.37783903347613]
Diffusion models offer unprecedented image generation capabilities given just a text prompt.
We present a saliency-guided framework to incorporate the data priors of human visual attention into the generation process.
arXiv Detail & Related papers (2024-03-16T21:01:35Z)
- NeRF-Gaze: A Head-Eye Redirection Parametric Model for Gaze Estimation [37.977032771941715]
We propose a novel Head-Eye redirection parametric model based on Neural Radiance Field.
Our model can decouple the face and eyes for separate neural rendering.
It can thereby control the face attributes, identity, illumination, and eye gaze direction separately.
arXiv Detail & Related papers (2022-12-30T13:52:28Z)
- Bipartite Graph Reasoning GANs for Person Pose and Facial Image Synthesis [201.39323496042527]
We present a novel bipartite graph reasoning Generative Adversarial Network (BiGraphGAN) for two challenging tasks: person pose and facial image synthesis.
The proposed graph generator consists of two novel blocks that aim to model the pose-to-pose and pose-to-image relations, respectively.
arXiv Detail & Related papers (2022-11-12T18:27:00Z)
- L2CS-Net: Fine-Grained Gaze Estimation in Unconstrained Environments [2.5234156040689237]
We propose a robust CNN-based model for predicting gaze in unconstrained settings.
We use two identical losses, one for each angle, to improve network learning and increase its generalization.
Our proposed model achieves state-of-the-art mean angular errors of 3.92° and 10.41° on the MPIIGaze and Gaze360 datasets, respectively.
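The "two identical losses, one for each angle" idea can be sketched as follows; the specific loss function (MSE here) is an assumption, since the summary does not name it.

```python
import torch
import torch.nn as nn

# One identical loss per gaze angle, so errors in yaw and pitch are
# penalized independently rather than mixed into a single term.
yaw_loss_fn = nn.MSELoss()
pitch_loss_fn = nn.MSELoss()

def gaze_loss(pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """pred/target: (batch, 2) tensors holding (yaw, pitch) angles."""
    yaw_loss = yaw_loss_fn(pred[:, 0], target[:, 0])
    pitch_loss = pitch_loss_fn(pred[:, 1], target[:, 1])
    return yaw_loss + pitch_loss
```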
arXiv Detail & Related papers (2022-03-07T12:35:39Z)
- Combining Attention with Flow for Person Image Synthesis [55.670135403481275]
We propose a novel model by combining the attention operation with the flow-based operation.
Our model not only takes advantage of the attention operation to generate accurate target structures but also uses the flow-based operation to sample realistic source textures.
arXiv Detail & Related papers (2021-08-04T03:05:39Z)
- Adaptive Feature Fusion Network for Gaze Tracking in Mobile Tablets [19.739595664816164]
We propose a novel Adaptive Feature Fusion Network (AFF-Net), which performs the gaze tracking task on mobile tablets.
We use Squeeze-and-Excitation layers to adaptively fuse the two eyes' features according to their appearance similarity, as sketched below.
Experiments on both GazeCapture and MPIIFaceGaze datasets demonstrate consistently superior performance of the proposed method.
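A minimal sketch of Squeeze-and-Excitation-based two-eye fusion, assuming the SE gate operates on the channel-stacked eye features; shapes and the reduction ratio are illustrative, not AFF-Net's exact layout.

```python
import torch
import torch.nn as nn

class SEFusion(nn.Module):
    """Stack left/right eye features on the channel axis, then let an
    SE gate re-weight channels, emphasizing the more reliable eye."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),            # squeeze
            nn.Linear(2 * channels, 2 * channels // reduction), nn.ReLU(),
            nn.Linear(2 * channels // reduction, 2 * channels),
            nn.Sigmoid(),                                     # excitation
        )

    def forward(self, left_feat, right_feat):
        x = torch.cat([left_feat, right_feat], dim=1)   # (B, 2C, H, W)
        w = self.gate(x).unsqueeze(-1).unsqueeze(-1)    # per-channel weights
        return x * w                                    # re-weighted fusion

fused = SEFusion(64)(torch.randn(2, 64, 6, 10), torch.randn(2, 64, 6, 10))
print(fused.shape)  # torch.Size([2, 128, 6, 10])
```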
arXiv Detail & Related papers (2021-03-20T07:16:10Z)
- One-shot Face Reenactment Using Appearance Adaptive Normalization [30.615671641713945]
The paper proposes a novel generative adversarial network for one-shot face reenactment.
It can animate a single face image to a different pose and expression while keeping its original appearance.
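The summary does not spell out the normalization; the sketch below assumes the common AdaIN pattern, where instance-normalized pose features are modulated by scale and shift parameters predicted from a source-appearance embedding. The paper's exact formulation may differ.

```python
import torch
import torch.nn as nn

class AppearanceAdaptiveNorm(nn.Module):
    def __init__(self, channels: int, appearance_dim: int):
        super().__init__()
        self.norm = nn.InstanceNorm2d(channels, affine=False)
        self.to_scale = nn.Linear(appearance_dim, channels)
        self.to_shift = nn.Linear(appearance_dim, channels)

    def forward(self, feat, appearance):
        # feat: (B, C, H, W) pose-driven features; appearance: (B, D) embedding.
        scale = self.to_scale(appearance).unsqueeze(-1).unsqueeze(-1)
        shift = self.to_shift(appearance).unsqueeze(-1).unsqueeze(-1)
        return (1 + scale) * self.norm(feat) + shift

aan = AppearanceAdaptiveNorm(64, 128)
print(aan(torch.randn(2, 64, 32, 32), torch.randn(2, 128)).shape)
# torch.Size([2, 64, 32, 32])
```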
arXiv Detail & Related papers (2021-02-08T03:36:30Z)
- LNSMM: Eye Gaze Estimation With Local Network Share Multiview Multitask [7.065909514483728]
We propose a novel methodology to estimate eye gaze points and eye gaze directions simultaneously.
Experiments show our method achieves state-of-the-art results compared with current mainstream methods on two indicators: gaze points and gaze directions.
arXiv Detail & Related papers (2021-01-18T15:14:24Z)
- Coarse-to-Fine Gaze Redirection with Numerical and Pictorial Guidance [74.27389895574422]
We propose a novel gaze redirection framework which exploits both a numerical and a pictorial direction guidance.
The proposed method outperforms the state-of-the-art approaches in terms of both image quality and redirection precision.
arXiv Detail & Related papers (2020-04-07T01:17:27Z)
- It's Written All Over Your Face: Full-Face Appearance-Based Gaze Estimation [82.16380486281108]
We propose an appearance-based method that only takes the full face image as input.
Our method encodes the face image using a convolutional neural network with spatial weights applied on the feature maps.
We show that our full-face method significantly outperforms the state of the art for both 2D and 3D gaze estimation.
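A minimal sketch of the spatial-weights mechanism described above: a small 1x1-convolution head predicts a weight map that re-weights the face feature maps, emphasizing informative facial regions. Layer counts and channel sizes are illustrative, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class SpatialWeights(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.weight_head = nn.Sequential(
            nn.Conv2d(channels, channels // 2, kernel_size=1), nn.ReLU(),
            nn.Conv2d(channels // 2, 1, kernel_size=1), nn.ReLU(),
        )

    def forward(self, feat):
        w = self.weight_head(feat)   # (B, 1, H, W) weight map
        return feat * w              # broadcast over channels

out = SpatialWeights(256)(torch.randn(2, 256, 13, 13))
print(out.shape)  # torch.Size([2, 256, 13, 13])
```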
arXiv Detail & Related papers (2016-11-27T15:00:10Z)