3DGazeNet: Generalizing Gaze Estimation with Weak-Supervision from
Synthetic Views
- URL: http://arxiv.org/abs/2212.02997v3
- Date: Tue, 12 Dec 2023 13:39:34 GMT
- Title: 3DGazeNet: Generalizing Gaze Estimation with Weak-Supervision from
Synthetic Views
- Authors: Evangelos Ververas, Polydefkis Gkagkos, Jiankang Deng, Michail
Christos Doukas, Jia Guo, Stefanos Zafeiriou
- Abstract summary: We propose to train general gaze estimation models which can be directly employed in novel environments without adaptation.
We create a large-scale dataset of diverse faces with gaze pseudo-annotations, which we extract based on the 3D geometry of the scene.
We test our method on the task of gaze generalization, demonstrating improvements of up to 30% over the state of the art when no ground-truth data are available.
- Score: 67.00931529296788
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Developing gaze estimation models that generalize well to unseen domains and
in-the-wild conditions remains a challenge with no known best solution. This is
mostly due to the difficulty of acquiring ground truth data that cover the
distribution of faces, head poses, and environments that exist in the real
world. Most recent methods attempt to close the gap between specific source and
target domains using domain adaptation. In this work, we propose to train
general gaze estimation models which can be directly employed in novel
environments without adaptation. To do so, we leverage the observation that
head, body, and hand pose estimation benefit from being recast as dense 3D
coordinate prediction, and similarly express gaze estimation as regression of
dense 3D eye meshes. To close the gap between image domains, we create a
large-scale dataset of diverse faces with gaze pseudo-annotations, which we
extract based on the 3D geometry of the scene, and design a multi-view
supervision framework to balance their effect during training. We test our
method on the task of gaze generalization, demonstrating improvements of up to
30% over the state of the art when no ground-truth data are available, and up
to 10% when they are. The project materials are available for
research purposes at https://github.com/Vagver/3DGazeNet.
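The central reformulation, regressing a dense 3D eye mesh and reading the gaze direction off its geometry, can be illustrated with a short sketch. The mesh topology and iris vertex indices below are assumptions for illustration (the paper's actual mesh layout may differ): the gaze vector is taken as the ray from the estimated eyeball center through the iris center.
```python
import numpy as np

# Assumed vertex layout for illustration; 3DGazeNet's actual mesh
# topology and iris indices may differ.
IRIS_IDX = np.arange(0, 32)  # hypothetical indices of iris-ring vertices

def gaze_from_eye_mesh(verts: np.ndarray) -> np.ndarray:
    """Recover a unit gaze vector from a predicted dense eye mesh (V, 3).

    Approximates the eyeball center by the centroid of all vertices and
    takes the gaze direction as the ray from that center through the
    centroid of the iris vertices.
    """
    eyeball_center = verts.mean(axis=0)
    iris_center = verts[IRIS_IDX].mean(axis=0)
    g = iris_center - eyeball_center
    return g / np.linalg.norm(g)
```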
Related papers
- GeoWizard: Unleashing the Diffusion Priors for 3D Geometry Estimation from a Single Image [94.56927147492738]
We introduce GeoWizard, a new generative foundation model designed for estimating geometric attributes from single images.
We show that leveraging diffusion priors can markedly improve generalization, detail preservation, and efficiency in resource usage.
We propose a simple yet effective strategy to segregate the complex data distribution of various scenes into distinct sub-distributions.
arXiv Detail & Related papers (2024-03-18T17:50:41Z)
- CrossGaze: A Strong Method for 3D Gaze Estimation in the Wild [4.089889918897877]
We propose CrossGaze, a strong baseline for gaze estimation.
Our model surpasses several state-of-the-art methods, achieving a mean angular error of 9.94 degrees.
Our proposed model serves as a strong foundation for future research and development in gaze estimation.
arXiv Detail & Related papers (2024-02-13T09:20:26Z)
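The mean angular error that CrossGaze (and most papers in this list) reports is the average angle between predicted and ground-truth 3D gaze vectors. A minimal implementation:
```python
import numpy as np

def mean_angular_error_deg(pred: np.ndarray, gt: np.ndarray) -> float:
    """Mean angle in degrees between predicted and ground-truth 3D gaze
    vectors, each of shape (N, 3)."""
    pred = pred / np.linalg.norm(pred, axis=1, keepdims=True)
    gt = gt / np.linalg.norm(gt, axis=1, keepdims=True)
    cos = np.clip(np.sum(pred * gt, axis=1), -1.0, 1.0)
    return float(np.degrees(np.arccos(cos)).mean())
```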
- A Survey on 3D Gaussian Splatting [51.96747208581275]
3D Gaussian splatting (GS) has emerged as a transformative technique in the realm of explicit radiance fields and computer graphics.
We provide the first systematic overview of the recent developments and critical contributions in the domain of 3D GS.
By enabling unprecedented rendering speed, 3D GS opens up a plethora of applications, ranging from virtual reality to interactive media and beyond.
arXiv Detail & Related papers (2024-01-08T13:42:59Z)
- RadOcc: Learning Cross-Modality Occupancy Knowledge through Rendering Assisted Distillation [50.35403070279804]
3D occupancy prediction is an emerging task that aims to estimate the occupancy states and semantics of 3D scenes using multi-view images.
We propose RadOcc, a Rendering assisted distillation paradigm for 3D Occupancy prediction.
arXiv Detail & Related papers (2023-12-19T03:39:56Z)
- Domain-Adaptive Full-Face Gaze Estimation via Novel-View-Synthesis and Feature Disentanglement [12.857137513211866]
We propose an effective model training pipeline consisting of training data synthesis and a gaze estimation model for unsupervised domain adaptation.
The proposed data synthesis leverages single-image 3D reconstruction to expand the range of head poses in the source domain without requiring a 3D facial shape dataset.
We propose a disentangling autoencoder network to separate gaze-related features and introduce a background augmentation consistency loss to exploit the characteristics of the synthetic source domain.
arXiv Detail & Related papers (2023-05-25T15:15:03Z)
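The background augmentation consistency loss above is described only at a high level; a plausible reading is that gaze predictions on a synthetic image should be invariant to swapping its background. A hedged PyTorch sketch, where `model` and the paired augmented batch are assumed names rather than the paper's API:
```python
import torch.nn.functional as F

def background_consistency_loss(model, img, img_bg_aug):
    """Penalize gaze predictions that change when only the background of a
    synthetic image is altered. Illustrative guess at the loss; the paper's
    exact formulation may differ.

    img, img_bg_aug: (B, 3, H, W) tensors, identical except for background.
    """
    gaze = model(img)            # assumed (B, 2) pitch/yaw output
    gaze_aug = model(img_bg_aug)
    return F.mse_loss(gaze_aug, gaze.detach())
```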
- L2CS-Net: Fine-Grained Gaze Estimation in Unconstrained Environments [2.5234156040689237]
We propose a robust CNN-based model for predicting gaze in unconstrained settings.
We use two identical losses, one for each angle, to improve network learning and increase its generalization.
Our proposed model achieves state-of-the-art accuracy of 3.92° and 10.41° on the MPIIGaze and Gaze360 datasets, respectively.
arXiv Detail & Related papers (2022-03-07T12:35:39Z)
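L2CS-Net's "two identical losses, one for each angle" amounts to supervising yaw and pitch through separate heads and loss terms. A simplified sketch (the paper's actual losses combine binned classification with regression; plain L1 stands in here for brevity):
```python
import torch.nn as nn

class TwoAngleLoss(nn.Module):
    """One loss term per gaze angle (yaw, pitch), summed for backprop.
    Simplified: plain L1 per angle stands in for the paper's combined
    classification-plus-regression losses."""

    def __init__(self):
        super().__init__()
        self.crit = nn.L1Loss()

    def forward(self, pred, target):
        # pred, target: (B, 2) tensors holding (yaw, pitch) in radians
        loss_yaw = self.crit(pred[:, 0], target[:, 0])
        loss_pitch = self.crit(pred[:, 1], target[:, 1])
        return loss_yaw + loss_pitch
```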
- Weakly-Supervised Physically Unconstrained Gaze Estimation [80.66438763587904]
We tackle the previously unexplored problem of weakly-supervised gaze estimation from videos of human interactions.
We propose a training algorithm along with several novel loss functions especially designed for the task.
We show significant improvements in (a) the accuracy of semi-supervised gaze estimation and (b) cross-domain generalization on the state-of-the-art physically unconstrained in-the-wild Gaze360 gaze estimation benchmark.
arXiv Detail & Related papers (2021-05-20T14:58:52Z)
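The loss functions themselves are not spelled out in this summary. One geometric cue available in videos of human interactions is mutual gaze: when two people look at each other, each person's gaze should align with the line connecting their heads. The sketch below is one plausible form of such a constraint, not necessarily the paper's exact loss:
```python
import torch.nn.functional as F

def mutual_gaze_loss(gaze_a, gaze_b, head_a, head_b):
    """Weak supervision from a 'looking at each other' event: person A's
    gaze should point at B's head and vice versa. Illustrative constraint,
    not the paper's verbatim loss.

    gaze_*: (B, 3) predicted gaze vectors; head_*: (B, 3) 3D head positions.
    """
    dir_ab = F.normalize(head_b - head_a, dim=1)     # direction A -> B
    cos_a = (F.normalize(gaze_a, dim=1) * dir_ab).sum(dim=1)
    cos_b = (F.normalize(gaze_b, dim=1) * -dir_ab).sum(dim=1)
    return (2.0 - cos_a - cos_b).mean()  # zero when both align perfectly
```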
- 360-Degree Gaze Estimation in the Wild Using Multiple Zoom Scales [26.36068336169795]
We develop a model that mimics humans' ability to estimate gaze by aggregating from focused looks.
The model avoids the need to extract clear eye patches.
We extend the model to handle the challenging task of 360-degree gaze estimation.
arXiv Detail & Related papers (2020-09-15T08:45:12Z)
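Aggregating over zoom scales can be as simple as running one backbone on progressively tighter center crops of the head image and fusing the predictions. A minimal sketch; the crop scales and mean fusion are illustrative choices, not the paper's exact architecture:
```python
import torch
import torch.nn.functional as F

def multi_zoom_gaze(model, head_img, scales=(1.0, 0.7, 0.5)):
    """Predict gaze at several zoom levels of a head crop and average.

    head_img: (B, 3, H, W); model maps images to (B, 3) gaze vectors.
    """
    _, _, H, W = head_img.shape
    preds = []
    for s in scales:
        h, w = int(H * s), int(W * s)
        top, left = (H - h) // 2, (W - w) // 2
        crop = head_img[:, :, top:top + h, left:left + w]
        crop = F.interpolate(crop, size=(H, W), mode='bilinear',
                             align_corners=False)
        preds.append(F.normalize(model(crop), dim=1))
    return F.normalize(torch.stack(preds).mean(dim=0), dim=1)
```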
- Learning to Detect Head Movement in Unconstrained Remote Gaze Estimation in the Wild [19.829721663742124]
We propose end-to-end appearance-based gaze estimation methods that robustly incorporate different levels of head-pose representation into gaze estimation.
Our method generalizes to real-world scenarios with low image quality, varied lighting, and settings where direct head-pose information is not available.
arXiv Detail & Related papers (2020-04-07T22:38:49Z)
- It's Written All Over Your Face: Full-Face Appearance-Based Gaze Estimation [82.16380486281108]
We propose an appearance-based method that only takes the full face image as input.
Our method encodes the face image using a convolutional neural network with spatial weights applied on the feature maps.
We show that our full-face method significantly outperforms the state of the art for both 2D and 3D gaze estimation.
arXiv Detail & Related papers (2016-11-27T15:00:10Z)
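The spatial weights mechanism of the full-face method learns a per-location weight map and multiplies it onto the CNN feature maps, emphasizing face regions informative for gaze. A sketch of this idea; the layer widths are assumptions:
```python
import torch.nn as nn

class SpatialWeights(nn.Module):
    """Learn a (B, 1, H, W) weight map from CNN features and apply it
    across channels. Layer widths here are illustrative."""

    def __init__(self, channels: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, 256, kernel_size=1), nn.ReLU(inplace=True),
            nn.Conv2d(256, 256, kernel_size=1), nn.ReLU(inplace=True),
            nn.Conv2d(256, 1, kernel_size=1), nn.ReLU(inplace=True),
        )

    def forward(self, feat):   # feat: (B, C, H, W)
        w = self.net(feat)     # per-location weight map
        return feat * w        # broadcast over channels
```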