GazeOnce: Real-Time Multi-Person Gaze Estimation
- URL: http://arxiv.org/abs/2204.09480v1
- Date: Wed, 20 Apr 2022 14:21:47 GMT
- Title: GazeOnce: Real-Time Multi-Person Gaze Estimation
- Authors: Mingfang Zhang, Yunfei Liu, Feng Lu
- Abstract summary: Appearance-based gaze estimation aims to predict the 3D eye gaze direction from a single image.
Recent deep learning-based approaches have demonstrated excellent performance, but cannot output multi-person gaze in real time.
We propose GazeOnce, which is capable of simultaneously predicting gaze directions for multiple faces in an image.
- Score: 18.16091280655655
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Appearance-based gaze estimation aims to predict the 3D eye gaze direction
from a single image. While recent deep learning-based approaches have
demonstrated excellent performance, they usually assume one calibrated face in
each input image and cannot output multi-person gaze in real time. However,
simultaneous gaze estimation for multiple people in the wild is necessary for
real-world applications. In this paper, we propose the first one-stage
end-to-end gaze estimation method, GazeOnce, which is capable of simultaneously
predicting gaze directions for multiple faces (>10) in an image. In addition,
we design a sophisticated data generation pipeline and propose a new dataset,
MPSGaze, which contains full images of multiple people with 3D gaze ground
truth. Experimental results demonstrate that our unified framework not only
offers a faster speed, but also provides a lower gaze estimation error compared
with state-of-the-art methods. This technique can be useful in real-time
applications with multiple users.
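Since the abstract highlights a one-stage design that predicts gaze for every face in a single forward pass, here is a minimal sketch of what such a multi-task head could look like, assuming an anchor-based detector layout; the layer sizes and branch names are illustrative assumptions, not the published GazeOnce architecture.

```python
import torch
import torch.nn as nn

class MultiPersonGazeHead(nn.Module):
    """Illustrative one-stage head: for each anchor location, predict a face
    confidence, a bounding box, and a gaze direction (pitch, yaw). Shapes are
    assumptions for illustration, not the published GazeOnce design."""

    def __init__(self, in_channels=256, num_anchors=2):
        super().__init__()
        # One shared 1x1 conv per output branch, applied to a backbone feature map.
        self.cls = nn.Conv2d(in_channels, num_anchors * 1, kernel_size=1)   # face / no-face
        self.box = nn.Conv2d(in_channels, num_anchors * 4, kernel_size=1)   # box offsets
        self.gaze = nn.Conv2d(in_channels, num_anchors * 2, kernel_size=1)  # pitch, yaw

    def forward(self, feats):
        # feats: (B, C, H, W) backbone feature map; a single forward pass
        # scores every anchor, so all faces in the image are handled at once.
        return self.cls(feats), self.box(feats), self.gaze(feats)
```

One forward pass over the shared feature map is what makes the multi-person setting real-time: the per-face cost is a few 1x1 convolutions rather than a separate crop-and-estimate pipeline per person.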
Related papers
- DiffGaze: A Diffusion Model for Continuous Gaze Sequence Generation on 360° Images [17.714378486267055]
We present DiffGaze, a novel method for generating realistic and diverse continuous human gaze sequences on 360° images.
Our evaluations show that DiffGaze outperforms state-of-the-art methods on all tasks.
arXiv Detail & Related papers (2024-03-26T08:13:02Z)
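The DiffGaze summary above does not specify the denoiser or noise schedule; the sketch below is a generic DDPM-style reverse-sampling loop over a continuous gaze-sequence tensor, with `eps_model` and the linear beta schedule as assumed placeholders rather than DiffGaze's actual components.

```python
import torch

@torch.no_grad()
def sample_gaze_sequence(eps_model, seq_len, T=1000, device="cpu"):
    """Generic DDPM-style reverse process over a (seq_len, 2) gaze sequence
    (e.g. two angles per time step). The denoiser `eps_model` and the linear
    beta schedule are placeholders, not DiffGaze's published components."""
    betas = torch.linspace(1e-4, 0.02, T, device=device)
    alphas = 1.0 - betas
    alpha_bars = torch.cumprod(alphas, dim=0)

    x = torch.randn(1, seq_len, 2, device=device)  # start from pure noise
    for t in reversed(range(T)):
        eps = eps_model(x, torch.tensor([t], device=device))  # predicted noise
        coef = betas[t] / torch.sqrt(1.0 - alpha_bars[t])
        x = (x - coef * eps) / torch.sqrt(alphas[t])           # posterior mean
        if t > 0:
            x = x + torch.sqrt(betas[t]) * torch.randn_like(x)  # add noise
    return x  # denoised continuous gaze sequence
```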
- UVAGaze: Unsupervised 1-to-2 Views Adaptation for Gaze Estimation [10.412375913640224]
We propose a novel 1-view-to-2-views (1-to-2 views) adaptation solution for gaze estimation.
Our method adapts a traditional single-view gaze estimator for flexibly placed dual cameras.
Experiments show that a single-view estimator, when adapted for dual views, can achieve much higher accuracy, especially in cross-dataset settings.
arXiv Detail & Related papers (2023-12-25T08:13:28Z)
- Deceptive-NeRF/3DGS: Diffusion-Generated Pseudo-Observations for High-Quality Sparse-View Reconstruction [60.52716381465063]
We introduce Deceptive-NeRF/3DGS to enhance sparse-view reconstruction with only a limited set of input images.
Specifically, we propose a deceptive diffusion model turning noisy images rendered from few-view reconstructions into high-quality pseudo-observations.
Our system progressively incorporates diffusion-generated pseudo-observations into the training image sets, ultimately densifying the sparse input observations by 5 to 10 times.
arXiv Detail & Related papers (2023-05-24T14:00:32Z)
- EFE: End-to-end Frame-to-Gaze Estimation [42.61379693370926]
We propose a frame-to-gaze network that directly predicts both 3D gaze origin and 3D gaze direction from the raw frame out of the camera without any face or eye cropping.
Our method demonstrates that direct gaze regression from the raw downscaled frame, from FHD/HD to VGA/HVGA resolution, is possible despite the challenges of having very few pixels in the eye region.
arXiv Detail & Related papers (2023-05-09T15:25:45Z)
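EFE's key idea, regressing a 3D gaze origin and a 3D gaze direction directly from the raw downscaled frame with no face or eye cropping, can be sketched as follows; the tiny backbone and head sizes are assumptions for illustration, not the paper's network.

```python
import torch
import torch.nn as nn

class FrameToGaze(nn.Module):
    """Sketch of a frame-to-gaze regressor in the spirit of EFE: the full
    (downscaled) camera frame in, 3D gaze origin and direction out.
    The backbone and layer widths here are assumptions."""

    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(  # stand-in for any conv encoder
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(64, 6)  # 3D origin (x, y, z) + 3D direction

    def forward(self, frame):
        # frame: (B, 3, H, W), e.g. an FHD frame downscaled to VGA/HVGA
        out = self.head(self.backbone(frame))
        origin, direction = out[:, :3], out[:, 3:]
        direction = direction / direction.norm(dim=1, keepdim=True)  # unit vector
        return origin, direction
```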
- 3DGazeNet: Generalizing Gaze Estimation with Weak-Supervision from Synthetic Views [67.00931529296788]
We propose to train general gaze estimation models which can be directly employed in novel environments without adaptation.
We create a large-scale dataset of diverse faces with gaze pseudo-annotations, which we extract based on the 3D geometry of the scene.
We test our method in the task of gaze generalization, in which we demonstrate an improvement of up to 30% over the state of the art when no ground-truth data are available.
arXiv Detail & Related papers (2022-12-06T14:15:17Z)
- Active Gaze Control for Foveal Scene Exploration [124.11737060344052]
We propose a methodology to emulate how humans and robots with foveal cameras would explore a scene.
The proposed method achieves an increase in detection F1-score of 2-3 percentage points for the same number of gaze shifts.
arXiv Detail & Related papers (2022-08-24T14:59:28Z)
- Weakly-Supervised Physically Unconstrained Gaze Estimation [80.66438763587904]
We tackle the previously unexplored problem of weakly-supervised gaze estimation from videos of human interactions.
We propose a training algorithm along with several novel loss functions especially designed for the task.
We show significant improvements in (a) the accuracy of semi-supervised gaze estimation and (b) cross-domain generalization on the state-of-the-art physically unconstrained in-the-wild Gaze360 gaze estimation benchmark.
arXiv Detail & Related papers (2021-05-20T14:58:52Z)
- Appearance-based Gaze Estimation With Deep Learning: A Review and Benchmark [14.306488668615883]
We present a systematic review of the appearance-based gaze estimation methods using deep learning.
We summarize the data pre-processing and post-processing methods, including face/eye detection, data rectification, 2D/3D gaze conversion and gaze origin conversion.
arXiv Detail & Related papers (2021-04-26T15:53:03Z)
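The 2D/3D gaze conversion mentioned in the review above is a standard operation; a common pitch/yaw-to-unit-vector conversion looks like the sketch below. Sign conventions vary between datasets, so treat this as one possible convention rather than the one the review prescribes.

```python
import numpy as np

def pitchyaw_to_vector(pitch, yaw):
    """Convert gaze (pitch, yaw) in radians to a 3D unit vector.
    This sign convention (camera looking down -z) is common in gaze
    datasets, but it is only one choice among several in use."""
    x = -np.cos(pitch) * np.sin(yaw)
    y = -np.sin(pitch)
    z = -np.cos(pitch) * np.cos(yaw)
    return np.array([x, y, z])

def vector_to_pitchyaw(v):
    """Inverse conversion: 3D unit vector back to (pitch, yaw) in radians."""
    v = v / np.linalg.norm(v)
    pitch = np.arcsin(-v[1])
    yaw = np.arctan2(-v[0], -v[2])
    return pitch, yaw
```

A quick round trip, e.g. `vector_to_pitchyaw(pitchyaw_to_vector(0.1, 0.2))`, recovers `(0.1, 0.2)`, which is a useful sanity check when mixing datasets with different conventions.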
- Unsupervised Learning on Monocular Videos for 3D Human Pose Estimation [121.5383855764944]
We use contrastive self-supervised learning to extract rich latent vectors from single-view videos.
We show that applying CSS only to the time-variant features, while also reconstructing the input and encouraging a gradual transition between temporally nearby and distant features, yields a rich latent space.
Our approach outperforms other unsupervised single-view methods and matches the performance of multi-view techniques.
arXiv Detail & Related papers (2020-12-02T20:27:35Z)
- 360-Degree Gaze Estimation in the Wild Using Multiple Zoom Scales [26.36068336169795]
We develop a model that mimics the human ability to estimate gaze by aggregating information from multiple focused looks.
The model avoids the need to extract clear eye patches.
We extend the model to handle the challenging task of 360-degree gaze estimation.
arXiv Detail & Related papers (2020-09-15T08:45:12Z)
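The multiple-zoom-scales idea above can be approximated by running one gaze model on crops at several zoom levels and aggregating the predictions; the simple averaging below is an assumed aggregation rule (the paper's model learns its own), and the crop sizes are illustrative.

```python
import torch

def multi_zoom_gaze(model, image, center, scales=(1.0, 1.5, 2.0)):
    """Hypothetical multi-zoom aggregation: crop around the face at several
    zoom levels, run the same gaze model on each crop, and average the
    predicted 3D gaze vectors. Averaging is an assumption, not the paper's
    learned aggregation; crop sizes are illustrative."""
    cx, cy = center
    preds = []
    for s in scales:
        half = int(112 * s)  # half-size of the crop window at this scale
        crop = image[:, :, max(cy - half, 0):cy + half, max(cx - half, 0):cx + half]
        crop = torch.nn.functional.interpolate(crop, size=(224, 224), mode="bilinear")
        preds.append(model(crop))  # assumed to return a (B, 3) gaze vector
    g = torch.stack(preds).mean(dim=0)
    return g / g.norm(dim=1, keepdim=True)  # renormalize the averaged vector
```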
- It's Written All Over Your Face: Full-Face Appearance-Based Gaze Estimation [82.16380486281108]
We propose an appearance-based method that only takes the full face image as input.
Our method encodes the face image using a convolutional neural network with spatial weights applied on the feature maps.
We show that our full-face method significantly outperforms the state of the art for both 2D and 3D gaze estimation.
arXiv Detail & Related papers (2016-11-27T15:00:10Z)
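The spatial-weights mechanism described above can be sketched as a small subnetwork that predicts a single H x W weight map from the feature maps and multiplies it onto every channel, emphasizing informative face regions; the layer count and widths below are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class SpatialWeights(nn.Module):
    """Sketch of spatial weighting on feature maps: learn one H x W weight
    map and broadcast-multiply it across all channels, so the network can
    up-weight informative face regions. Layer widths are illustrative."""

    def __init__(self, channels):
        super().__init__()
        self.weight_net = nn.Sequential(
            nn.Conv2d(channels, channels // 2, kernel_size=1), nn.ReLU(),
            nn.Conv2d(channels // 2, 1, kernel_size=1), nn.ReLU(),
        )

    def forward(self, feats):
        # feats: (B, C, H, W); weights: (B, 1, H, W), broadcast over channels
        weights = self.weight_net(feats)
        return feats * weights
```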