Gaze Estimation with an Ensemble of Four Architectures
- URL: http://arxiv.org/abs/2107.01980v1
- Date: Mon, 5 Jul 2021 12:40:26 GMT
- Title: Gaze Estimation with an Ensemble of Four Architectures
- Authors: Xin Cai, Boyu Chen, Jiabei Zeng, Jiajun Zhang, Yunjia Sun, Xiao Wang,
Zhilong Ji, Xiao Liu, Xilin Chen, Shiguang Shan
- Abstract summary: We train several gaze estimators adopting four different network architectures.
We select the best six estimators and ensemble their predictions through a linear combination.
The method ranks first on the leader-board of the ETH-XGaze Competition, achieving an average angular error of $3.11^{\circ}$ on the ETH-XGaze test set.
- Score: 116.53389064096139
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents a method for gaze estimation from face images. We
train several gaze estimators adopting four different network architectures,
including an architecture designed for gaze estimation (i.e., iTracker-MHSA) and
three originally designed for general computer vision tasks (i.e., BoTNet,
HRNet, ResNeSt). Then, we select the best six estimators and ensemble their
predictions through a linear combination. The method ranks first on the
leader-board of the ETH-XGaze Competition, achieving an average angular error of
$3.11^{\circ}$ on the ETH-XGaze test set.
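The abstract specifies a linear combination of six estimators and evaluation by average angular error, but not how the combination weights are obtained or how the error is computed. The snippet below is a minimal sketch of one plausible setup, assuming NumPy, least-squares weights fitted on a held-out validation split, and one common pitch/yaw-to-vector convention; none of these choices are confirmed by the paper.

```python
# Minimal sketch: linear-combination ensemble of gaze estimators plus
# mean angular error. The weight-fitting procedure (least squares on a
# validation split) and the pitch/yaw convention are assumptions, not
# the authors' exact recipe.
import numpy as np

def pitchyaw_to_vector(py):
    """Convert (N, 2) pitch/yaw angles in radians to (N, 3) unit gaze vectors."""
    pitch, yaw = py[:, 0], py[:, 1]
    return np.stack([
        -np.cos(pitch) * np.sin(yaw),
        -np.sin(pitch),
        -np.cos(pitch) * np.cos(yaw),
    ], axis=1)

def mean_angular_error(pred_py, true_py):
    """Mean angle (degrees) between predicted and ground-truth gaze vectors."""
    a = pitchyaw_to_vector(pred_py)
    b = pitchyaw_to_vector(true_py)
    cos = np.clip(np.sum(a * b, axis=1), -1.0, 1.0)
    return np.degrees(np.arccos(cos)).mean()

def fit_ensemble_weights(val_preds, val_true):
    """val_preds: (K, N, 2) pitch/yaw predictions of K estimators.
    Returns per-estimator weights fitted by least squares on the validation set."""
    K = val_preds.shape[0]
    X = val_preds.reshape(K, -1).T        # (2N, K)
    y = val_true.reshape(-1)              # (2N,)
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w                              # (K,)

def ensemble(test_preds, w):
    """Linear combination of K estimators' (N, 2) pitch/yaw outputs."""
    return np.tensordot(w, test_preds, axes=1)   # (N, 2)
```

With the six selected estimators, K = 6; because the fit is unconstrained least squares in this sketch, the weights need not sum to one.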
Related papers
- 4D ASR: Joint Beam Search Integrating CTC, Attention, Transducer, and Mask Predict Decoders [53.297697898510194]
We propose a joint modeling scheme where four decoders share the same encoder -- we refer to this as 4D modeling.
To efficiently train the 4D model, we introduce a two-stage training strategy that stabilizes multitask learning.
In addition, we propose three novel one-pass beam search algorithms by combining three decoders.
arXiv Detail & Related papers (2024-06-05T05:18:20Z)
- CrossGaze: A Strong Method for 3D Gaze Estimation in the Wild [4.089889918897877]
We propose CrossGaze, a strong baseline for gaze estimation.
Our model surpasses several state-of-the-art methods, achieving a mean angular error of 9.94 degrees.
Our proposed model serves as a strong foundation for future research and development in gaze estimation.
arXiv Detail & Related papers (2024-02-13T09:20:26Z)
- 360 Layout Estimation via Orthogonal Planes Disentanglement and Multi-view Geometric Consistency Perception [56.84921040837699]
Existing panoramic layout estimation solutions tend to recover room boundaries from a vertically compressed sequence, yielding imprecise results.
We propose an orthogonal plane disentanglement network (termed DOPNet) to distinguish ambiguous semantics.
We also present an unsupervised adaptation technique tailored for horizon-depth and ratio representations.
Our solution outperforms other SoTA models on both monocular layout estimation and multi-view layout estimation tasks.
arXiv Detail & Related papers (2023-12-26T12:16:03Z)
- Investigation of Architectures and Receptive Fields for Appearance-based Gaze Estimation [29.154335016375367]
We show that tuning a few simple parameters of a ResNet architecture can outperform most existing state-of-the-art methods on the gaze estimation task (a minimal sketch of one such adjustment is given after this list).
We obtain state-of-the-art performance on three datasets, with gaze estimation errors of 3.64 degrees on ETH-XGaze, 4.50 degrees on MPIIFaceGaze, and 9.13 degrees on Gaze360.
arXiv Detail & Related papers (2023-08-18T14:41:51Z)
- NeRF-Gaze: A Head-Eye Redirection Parametric Model for Gaze Estimation [37.977032771941715]
We propose a novel Head-Eye redirection parametric model based on Neural Radiance Field.
Our model can decouple the face and eyes for separate neural rendering.
This makes it possible to separately control the face attributes, identity, illumination, and eye gaze direction.
arXiv Detail & Related papers (2022-12-30T13:52:28Z)
- 3DGazeNet: Generalizing Gaze Estimation with Weak-Supervision from Synthetic Views [67.00931529296788]
We propose to train general gaze estimation models which can be directly employed in novel environments without adaptation.
We create a large-scale dataset of diverse faces with gaze pseudo-annotations, which we extract based on the 3D geometry of the scene.
We test our method in the task of gaze generalization, in which we demonstrate improvement of up to 30% compared to state-of-the-art when no ground truth data are available.
arXiv Detail & Related papers (2022-12-06T14:15:17Z)
- RINDNet: Edge Detection for Discontinuity in Reflectance, Illumination, Normal and Depth [70.25160895688464]
We propose a novel neural network solution, RINDNet, to jointly detect all four types of edges.
RINDNet learns effective representations for each of them and works in three stages.
In our experiments, RINDNet yields promising results in comparison with state-of-the-art methods.
arXiv Detail & Related papers (2021-08-02T03:30:01Z)
- LNSMM: Eye Gaze Estimation With Local Network Share Multiview Multitask [7.065909514483728]
We propose a novel methodology to estimate eye gaze points and eye gaze directions simultaneously.
Experiments show that our method achieves state-of-the-art performance compared with current mainstream methods on two indicators: gaze points and gaze directions.
arXiv Detail & Related papers (2021-01-18T15:14:24Z)
- ETH-XGaze: A Large Scale Dataset for Gaze Estimation under Extreme Head Pose and Gaze Variation [52.5465548207648]
ETH-XGaze is a new gaze estimation dataset consisting of over one million high-resolution images of varying gaze under extreme head poses.
We show that our dataset can significantly improve the robustness of gaze estimation methods across different head poses and gaze angles.
arXiv Detail & Related papers (2020-07-31T04:15:53Z)
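The ETH-XGaze entry above (and the main paper, which reports results on its test set) refers to a large HDF5-based image collection. The following is a minimal loading sketch; the field names ("face_patch", "face_gaze") and the file path are assumptions about the released format, not something either paper guarantees.

```python
# Minimal sketch of reading ETH-XGaze-style HDF5 files with h5py.
# The dataset keys "face_patch" and "face_gaze" are assumptions and
# may differ from the released files.
import h5py
import numpy as np

def load_samples(h5_path, indices):
    """Return (face crops, pitch/yaw gaze labels) for the given frame indices."""
    with h5py.File(h5_path, "r") as f:
        faces = np.stack([f["face_patch"][i] for i in indices])  # HxWx3 uint8 crops
        gazes = np.stack([f["face_gaze"][i] for i in indices])   # (pitch, yaw) in radians
    return faces, gazes

# Example (path is illustrative):
# faces, gazes = load_samples("xgaze_224/train/subject0000.h5", range(4))
```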
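The "Investigation of Architectures and Receptive Fields for Appearance-based Gaze Estimation" entry earlier in this list claims that tuning a few ResNet parameters is enough for strong gaze estimation. The sketch below shows one illustrative adjustment using torchvision, trading stride for dilation in the late stages and attaching a two-unit pitch/yaw head; these specific choices are assumptions, not the paper's reported configuration.

```python
# Illustrative sketch: adjust a torchvision ResNet's stride/dilation
# trade-off and regress pitch/yaw for gaze. Not the paper's exact setup.
import torch
import torch.nn as nn
from torchvision.models import resnet50

def build_gaze_resnet(dilate_last_stages=True):
    # Replacing stride with dilation in the last stages keeps the feature
    # map at higher resolution while preserving receptive-field growth,
    # one common way to alter the resolution/receptive-field trade-off.
    backbone = resnet50(
        weights=None,
        replace_stride_with_dilation=[False, dilate_last_stages, dilate_last_stages],
    )
    # Regress pitch and yaw (2 values) instead of ImageNet classes.
    backbone.fc = nn.Linear(backbone.fc.in_features, 2)
    return backbone

model = build_gaze_resnet()
dummy = torch.randn(1, 3, 224, 224)   # e.g. a 224x224 RGB face crop
print(model(dummy).shape)             # torch.Size([1, 2]) -> (pitch, yaw)
```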