LatentGaze: Cross-Domain Gaze Estimation through Gaze-Aware Analytic
Latent Code Manipulation
- URL: http://arxiv.org/abs/2209.10171v1
- Date: Wed, 21 Sep 2022 08:05:53 GMT
- Title: LatentGaze: Cross-Domain Gaze Estimation through Gaze-Aware Analytic
Latent Code Manipulation
- Authors: Isack Lee, Jun-Seok Yun, Hee Hyeon Kim, Youngju Na, Seok Bong Yoo
- Abstract summary: We propose a gaze-aware analytic manipulation method, based on a data-driven approach that exploits the disentanglement characteristics of generative adversarial network (GAN) inversion.
By utilizing a GAN-based encoder-generator process, we shift the input image from the target domain to the source domain, of which the gaze estimator is sufficiently aware.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although recent gaze estimation methods lay great emphasis on attentively
extracting gaze-relevant features from facial or eye images, how to define
features that include gaze-relevant components has remained ambiguous. This
obscurity makes the model learn not only gaze-relevant features but also
irrelevant ones, which is particularly damaging to cross-dataset performance.
To overcome this challenging issue, we propose a gaze-aware analytic
manipulation method, based on a data-driven approach that exploits the
disentanglement characteristics of generative adversarial network (GAN)
inversion, to selectively utilize gaze-relevant features in a latent code.
Furthermore, by utilizing a GAN-based encoder-generator process, we shift the
input image from the target domain to the source domain, of which a gaze
estimator is sufficiently aware. In addition, we propose a gaze distortion loss
in the encoder that prevents distortion of the gaze information. The
experimental results demonstrate that our method achieves state-of-the-art
gaze estimation accuracy in cross-domain gaze estimation tasks. The code is
available at https://github.com/leeisack/LatentGaze/.
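The pipeline described in the abstract can be sketched as follows. This is a minimal, hypothetical illustration, not the paper's implementation: the `encode`/`generate` random projections stand in for the GAN-inversion encoder and generator, `GAZE_DIMS` is an assumed set of gaze-relevant latent channels, and `source_mean` stands in for source-domain latent statistics. The point is the structure: keep the gaze-relevant part of the target image's latent code, replace the rest with source-domain content, and penalize any change to the gaze-relevant channels.

```python
import numpy as np

LATENT_DIM = 16           # toy latent size; the paper uses a GAN latent space
GAZE_DIMS = np.arange(4)  # hypothetical indices of gaze-relevant channels

def encode(image: np.ndarray) -> np.ndarray:
    """Stand-in for the GAN-inversion encoder: a fixed random projection."""
    rng = np.random.default_rng(0)
    proj = rng.standard_normal((image.size, LATENT_DIM))
    return image.reshape(-1) @ proj

def generate(latent: np.ndarray) -> np.ndarray:
    """Stand-in for the GAN generator: projects the latent back to an image."""
    rng = np.random.default_rng(1)
    proj = rng.standard_normal((LATENT_DIM, 8 * 8))
    return (latent @ proj).reshape(8, 8)

def manipulate(target_latent, source_mean_latent, gaze_dims=GAZE_DIMS):
    """Keep gaze-relevant channels from the target image and replace the rest
    with source-domain content, so the generated image lies in the domain the
    gaze estimator knows."""
    out = source_mean_latent.copy()
    out[gaze_dims] = target_latent[gaze_dims]
    return out

def gaze_distortion_loss(latent_a, latent_b, gaze_dims=GAZE_DIMS):
    """Penalize changes in the gaze-relevant part of the latent code."""
    diff = latent_a[gaze_dims] - latent_b[gaze_dims]
    return float(np.mean(diff ** 2))

# Toy usage: shift a target-domain image toward the source domain.
target_image = np.random.default_rng(2).standard_normal((8, 8))
source_mean = np.zeros(LATENT_DIM)  # stand-in for source-domain mean latent

z_target = encode(target_image)
z_shifted = manipulate(z_target, source_mean)
shifted_image = generate(z_shifted)

print(shifted_image.shape)                        # (8, 8)
print(gaze_distortion_loss(z_target, z_shifted))  # 0.0: gaze channels preserved
```

Because `manipulate` copies the gaze-relevant channels unchanged, the distortion loss between the original and manipulated latents is zero by construction; in the actual method the loss instead constrains the learned encoder.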
Related papers
- Eyes Wide Unshut: Unsupervised Mistake Detection in Egocentric Procedural Video by Detecting Unpredictable Gaze [13.99137623722021]
We introduce an unsupervised method for detecting mistakes in videos of human activities.
We postulate that, when a subject is making a mistake in the execution of a procedure, their attention patterns will deviate from normality.
We compare gaze trajectories predicted from input video with ground truth gaze signals collected through a gaze tracker.
arXiv Detail & Related papers (2024-06-12T16:29:45Z)
- Learning Gaze-aware Compositional GAN [30.714854907472333]
We present a generative framework to create annotated gaze data by leveraging the benefits of labeled and unlabeled data sources.
We show our approach's effectiveness in generating within-domain image augmentations in the ETH-XGaze dataset and cross-domain augmentations in the CelebAMask-HQ dataset domain for gaze estimation training.
arXiv Detail & Related papers (2024-05-31T07:07:54Z)
- Modeling State Shifting via Local-Global Distillation for Event-Frame Gaze Tracking [61.44701715285463]
This paper tackles the problem of passive gaze estimation using both event and frame data.
We reformulate gaze estimation as quantifying the state shift from the current state to several previously registered anchor states.
To improve the generalization ability, instead of learning a large gaze estimation network directly, we align a group of local experts with a student network.
arXiv Detail & Related papers (2024-03-31T03:30:37Z)
- Geo-Localization Based on Dynamically Weighted Factor-Graph [74.75763142610717]
Feature-based geo-localization relies on associating features extracted from aerial imagery with those detected by the vehicle's sensors.
This requires that the landmark types be observable from both sources.
We present a dynamically weighted factor graph model for the vehicle's trajectory estimation.
arXiv Detail & Related papers (2023-11-13T12:44:14Z)
- Bridging the Gap: Gaze Events as Interpretable Concepts to Explain Deep Neural Sequence Models [0.7829352305480283]
In this work, we employ established gaze event detection algorithms for fixations and saccades.
We quantitatively evaluate the impact of these events by determining their concept influence.
arXiv Detail & Related papers (2023-04-12T10:15:31Z)
- 3DGazeNet: Generalizing Gaze Estimation with Weak-Supervision from Synthetic Views [67.00931529296788]
We propose to train general gaze estimation models which can be directly employed in novel environments without adaptation.
We create a large-scale dataset of diverse faces with gaze pseudo-annotations, which we extract based on the 3D geometry of the scene.
We test our method on the task of gaze generalization, where we demonstrate an improvement of up to 30% over the state of the art when no ground-truth data are available.
arXiv Detail & Related papers (2022-12-06T14:15:17Z)
- Jitter Does Matter: Adapting Gaze Estimation to New Domains [12.482427155726413]
We propose to utilize gaze jitter to analyze and optimize gaze domain adaptation task.
We find that the high-frequency component (HFC) is an important factor that leads to jitter.
We employ contrastive learning to encourage the model to obtain similar representations between original and perturbed data.
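The alignment objective described above can be sketched minimally. This is a hypothetical illustration, not the paper's implementation: it shows only the positive-pair term that pulls the representations of an original input and its perturbed counterpart together (a full contrastive loss such as NT-Xent would also push apart negatives).

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two representation vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def alignment_loss(z_orig: np.ndarray, z_pert: np.ndarray) -> float:
    """Positive-pair term: zero when the original and perturbed
    representations point in the same direction."""
    return 1.0 - cosine_similarity(z_orig, z_pert)

z = np.array([1.0, 2.0, 3.0])
print(alignment_loss(z, 2.0 * z))  # approximately 0: scaled copies align
```

Minimizing this term over original/perturbed pairs encourages the encoder to produce representations invariant to the perturbation, e.g. the high-frequency component the summary identifies as the source of jitter.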
arXiv Detail & Related papers (2022-10-05T08:20:41Z)
- Active Gaze Control for Foveal Scene Exploration [124.11737060344052]
We propose a methodology to emulate how humans and robots with foveal cameras would explore a scene.
The proposed method achieves an increase in detection F1-score of 2-3 percentage points for the same number of gaze shifts.
arXiv Detail & Related papers (2022-08-24T14:59:28Z)
- Weakly-Supervised Physically Unconstrained Gaze Estimation [80.66438763587904]
We tackle the previously unexplored problem of weakly-supervised gaze estimation from videos of human interactions.
We propose a training algorithm along with several novel loss functions especially designed for the task.
We show significant improvements in (a) the accuracy of semi-supervised gaze estimation and (b) cross-domain generalization on the state-of-the-art physically unconstrained in-the-wild Gaze360 gaze estimation benchmark.
arXiv Detail & Related papers (2021-05-20T14:58:52Z)
- PureGaze: Purifying Gaze Feature for Generalizable Gaze Estimation [12.076469954457007]
We tackle the domain generalization problem in cross-domain gaze estimation for unknown target domains.
To be specific, we realize the domain generalization by gaze feature purification.
We design a plug-and-play self-adversarial framework for the gaze feature purification.
arXiv Detail & Related papers (2021-03-24T13:22:00Z)
- Towards End-to-end Video-based Eye-Tracking [50.0630362419371]
Estimating eye gaze from images alone is a challenging task due to unobservable person-specific factors.
We propose a novel dataset and accompanying method which aims to explicitly learn these semantic and temporal relationships.
We demonstrate that fusing information from visual stimuli and eye images can achieve performance comparable to figures reported in the literature.
arXiv Detail & Related papers (2020-07-26T12:39:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.