DVGaze: Dual-View Gaze Estimation
- URL: http://arxiv.org/abs/2308.10310v1
- Date: Sun, 20 Aug 2023 16:14:22 GMT
- Title: DVGaze: Dual-View Gaze Estimation
- Authors: Yihua Cheng and Feng Lu
- Abstract summary: We propose a dual-view gaze estimation network (DV-Gaze) that estimates gaze directions from a pair of images.
DV-Gaze achieves state-of-the-art performance on the ETH-XGaze and EVE datasets.
- Score: 13.3539097295729
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Gaze estimation methods typically estimate gaze from facial appearance captured by a single camera. However, due to its limited view, a single camera cannot capture complete facial information, which complicates the gaze estimation problem. Camera hardware has also advanced rapidly: dual cameras are now affordable and have been integrated into many devices. This development suggests that gaze estimation performance can be further improved with dual-view gaze estimation. In this paper, we propose a dual-view gaze estimation network (DV-Gaze), which estimates dual-view gaze directions from a pair of images. We first propose a dual-view interactive convolution (DIC) block in DV-Gaze. DIC blocks exchange dual-view information during convolution at multiple feature scales; each block fuses dual-view features along epipolar lines and compensates the original features with the fused features. We further propose a dual-view transformer that estimates gaze from dual-view features, with camera poses encoded to provide position information. We also consider the geometric relation between dual-view gaze directions and propose a dual-view gaze consistency loss for DV-Gaze. DV-Gaze achieves state-of-the-art performance on the ETH-XGaze and EVE datasets, and our experiments demonstrate the potential of dual-view gaze estimation. We release code at https://github.com/yihuacheng/DVGaze.
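To make the DIC idea concrete, the following is a minimal PyTorch sketch of its "fuse, then compensate" structure. It is not the paper's implementation: the epipolar-line fusion is replaced here by a simple channel concatenation and 1x1 convolution, and the class name `DICBlock`, layer sizes, and tensor shapes are all illustrative assumptions.

```python
# Minimal sketch of a dual-view interactive convolution (DIC) block.
# NOTE: illustrative simplification. The paper fuses features along
# epipolar lines; this stand-in uses channel concatenation + 1x1 conv,
# keeping only the "fuse, then compensate the original feature" pattern.
import torch
import torch.nn as nn

class DICBlock(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        # Per-view convolution applied before the cross-view exchange.
        self.conv = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )
        # Fuses the two views' features (stand-in for epipolar fusion).
        self.fuse = nn.Conv2d(2 * channels, channels, 1)

    def forward(self, f_a: torch.Tensor, f_b: torch.Tensor):
        f_a, f_b = self.conv(f_a), self.conv(f_b)
        fused = self.fuse(torch.cat([f_a, f_b], dim=1))
        # Compensate each view's original feature with the fused feature.
        return f_a + fused, f_b + fused

# Example: 64-channel feature maps from the two camera views.
a, b = torch.randn(2, 64, 28, 28), torch.randn(2, 64, 28, 28)
out_a, out_b = DICBlock(64)(a, b)
print(out_a.shape, out_b.shape)  # torch.Size([2, 64, 28, 28]) twice
```

Stacking such blocks at several stages of a backbone gives the multi-scale dual-view exchange the abstract describes.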
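The dual-view transformer can be sketched in the same hedged spirit: each view's pooled feature becomes one token, and a learned embedding of the camera pose is added as position information before cross-view attention. The 12-value pose parameterisation (flattened 3x3 rotation plus translation), the (pitch, yaw) output, and all layer sizes below are assumptions, not the paper's exact design.

```python
# Minimal sketch of a dual-view transformer head with camera-pose encoding.
import torch
import torch.nn as nn

class DualViewTransformer(nn.Module):
    def __init__(self, dim: int = 256, heads: int = 8, layers: int = 2):
        super().__init__()
        # Assumed pose format: flattened 3x3 rotation + 3-d translation.
        self.pose_embed = nn.Linear(12, dim)
        layer = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, layers)
        self.head = nn.Linear(dim, 2)  # gaze as (pitch, yaw) per view

    def forward(self, feat_a, feat_b, pose_a, pose_b):
        # feat_*: (B, dim) pooled view features; pose_*: (B, 12) poses.
        tokens = torch.stack(
            [feat_a + self.pose_embed(pose_a),
             feat_b + self.pose_embed(pose_b)], dim=1)  # (B, 2, dim)
        tokens = self.encoder(tokens)  # cross-view self-attention
        return self.head(tokens)       # (B, 2, 2): dual-view gaze angles

B, dim = 4, 256
model = DualViewTransformer(dim)
gaze = model(torch.randn(B, dim), torch.randn(B, dim),
             torch.randn(B, 12), torch.randn(B, 12))
print(gaze.shape)  # torch.Size([4, 2, 2])
```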
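Finally, the dual-view gaze consistency loss follows from the geometry the abstract alludes to: a gaze direction predicted in one camera's coordinate system, rotated by the known relative camera rotation, should agree with the gaze predicted in the other view. Below is a sketch of one plausible form, penalising the angle via cosine similarity; the paper's exact formulation may differ.

```python
# Sketch of a dual-view gaze consistency loss (assumed form).
import torch
import torch.nn.functional as F

def dual_view_consistency_loss(gaze_a, gaze_b, R_ab):
    """gaze_a, gaze_b: (B, 3) gaze vectors in each camera's frame;
    R_ab: (B, 3, 3) rotation from camera A's frame to camera B's."""
    gaze_a = F.normalize(gaze_a, dim=-1)
    gaze_b = F.normalize(gaze_b, dim=-1)
    # Express view A's gaze direction in view B's coordinate system.
    gaze_a_in_b = torch.bmm(R_ab, gaze_a.unsqueeze(-1)).squeeze(-1)
    # Penalise the angle between the two directions (1 - cosine).
    return (1.0 - (gaze_a_in_b * gaze_b).sum(dim=-1)).mean()

B = 4
R = torch.eye(3).repeat(B, 1, 1)            # identity poses for the demo
g = F.normalize(torch.randn(B, 3), dim=-1)
print(dual_view_consistency_loss(g, g, R))  # ~0 for identical gaze
```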
Related papers
- GazeGen: Gaze-Driven User Interaction for Visual Content Generation [11.03973723295504]
We present GazeGen, a user interaction system that generates visual content (images and videos) for locations indicated by the user's eye gaze.
Using advanced techniques in object detection and generative AI, GazeGen performs gaze-controlled image adding/deleting, repositioning, and surface material changes of image objects, and converts static images into videos.
Central to GazeGen is the DFT Gaze agent, an ultra-lightweight model with only 281K parameters, performing accurate real-time gaze predictions tailored to individual users' eyes on small edge devices.
arXiv Detail & Related papers (2024-11-07T00:22:38Z) - Merging Multiple Datasets for Improved Appearance-Based Gaze Estimation [10.682719521609743]
The Two-stage Transformer-based Gaze-feature Fusion (TTGF) method uses transformers to merge information from each eye and the face separately, and then to merge across the two eyes.
The proposed Gaze Adaptation Module (GAM) handles annotation inconsistency by applying a per-dataset adaptation module that corrects gaze estimates from a single shared estimator.
arXiv Detail & Related papers (2024-09-02T02:51:40Z) - What Do You See in Vehicle? Comprehensive Vision Solution for In-Vehicle Gaze Estimation [18.155092199205907]
In this paper, we present three novel elements to advance in-vehicle gaze research.
First, we introduce IVGaze, a pioneering dataset capturing in-vehicle gaze.
Second, our research focuses on in-vehicle gaze estimation leveraging the IVGaze dataset.
Third, we explore a novel strategy for gaze zone classification by extending the GazeDPTR.
arXiv Detail & Related papers (2024-03-23T01:22:15Z) - UVAGaze: Unsupervised 1-to-2 Views Adaptation for Gaze Estimation [10.412375913640224]
We propose a novel 1-view-to-2-views (1-to-2 views) adaptation solution for gaze estimation.
Our method adapts a traditional single-view gaze estimator for flexibly placed dual cameras.
Experiments show that a single-view estimator, when adapted for dual views, can achieve much higher accuracy, especially in cross-dataset settings.
arXiv Detail & Related papers (2023-12-25T08:13:28Z) - Two-level Data Augmentation for Calibrated Multi-view Detection [51.5746691103591]
We introduce a new multi-view data augmentation pipeline that preserves alignment among views.
We also propose a second level of augmentation applied directly at the scene level.
When combined with our simple multi-view detection model, our two-level augmentation pipeline outperforms all existing baselines.
arXiv Detail & Related papers (2022-10-19T17:55:13Z) - GazeOnce: Real-Time Multi-Person Gaze Estimation [18.16091280655655]
Appearance-based gaze estimation aims to predict the 3D eye gaze direction from a single image.
Recent deep learning-based approaches have demonstrated excellent performance, but cannot output multi-person gaze in real time.
We propose GazeOnce, which is capable of simultaneously predicting gaze directions for multiple faces in an image.
arXiv Detail & Related papers (2022-04-20T14:21:47Z) - Novel View Video Prediction Using a Dual Representation [51.58657840049716]
Given a set of input video clips from single or multiple views, our network is able to predict the video from a novel view.
The proposed approach does not require any priors and is able to predict the video from wider angular distances, up to 45 degrees.
A comparison with state-of-the-art novel view video prediction methods shows an improvement of 26.1% in SSIM, 13.6% in PSNR, and 60% in FVD scores without using explicit priors from target views.
arXiv Detail & Related papers (2021-06-07T20:41:33Z) - Weakly-Supervised Physically Unconstrained Gaze Estimation [80.66438763587904]
We tackle the previously unexplored problem of weakly-supervised gaze estimation from videos of human interactions.
We propose a training algorithm along with several novel loss functions especially designed for the task.
We show significant improvements in (a) the accuracy of semi-supervised gaze estimation and (b) cross-domain generalization on the state-of-the-art physically unconstrained in-the-wild Gaze360 gaze estimation benchmark.
arXiv Detail & Related papers (2021-05-20T14:58:52Z) - Dual In-painting Model for Unsupervised Gaze Correction and Animation in the Wild [82.42401132933462]
We present a solution that works without the need for precise annotations of the gaze angle and the head pose.
Our method consists of three novel modules: the Gaze Correction module (GCM), the Gaze Animation module (GAM), and the Pretrained Autoencoder module (PAM).
arXiv Detail & Related papers (2020-08-09T23:14:16Z) - ETH-XGaze: A Large Scale Dataset for Gaze Estimation under Extreme Head Pose and Gaze Variation [52.5465548207648]
ETH-XGaze is a new gaze estimation dataset consisting of over one million high-resolution images of varying gaze under extreme head poses.
We show that our dataset can significantly improve the robustness of gaze estimation methods across different head poses and gaze angles.
arXiv Detail & Related papers (2020-07-31T04:15:53Z) - Coarse-to-Fine Gaze Redirection with Numerical and Pictorial Guidance [74.27389895574422]
We propose a novel gaze redirection framework which exploits both a numerical and a pictorial direction guidance.
The proposed method outperforms the state-of-the-art approaches in terms of both image quality and redirection precision.
arXiv Detail & Related papers (2020-04-07T01:17:27Z)