SLYKLatent: A Learning Framework for Gaze Estimation Using Deep Facial Feature Learning
- URL: http://arxiv.org/abs/2402.01555v2
- Date: Wed, 13 Nov 2024 11:44:10 GMT
- Title: SLYKLatent: A Learning Framework for Gaze Estimation Using Deep Facial Feature Learning
- Authors: Samuel Adebayo, Joost C. Dessing, Seán McLoone
- Abstract summary: We present SLYKLatent, a novel approach for enhancing gaze estimation by addressing appearance instability challenges in datasets.
SLYKLatent utilizes Self-Supervised Learning for initial training with facial expression datasets, followed by refinement with a patch-based tri-branch network.
Our evaluation on benchmark datasets achieves a 10.9% improvement on Gaze360, exceeds the top MPIIFaceGaze results by 3.8%, and leads on a subset of ETH-XGaze by 11.6%.
- Score: 0.0
- Abstract: In this research, we present SLYKLatent, a novel approach for enhancing gaze estimation by addressing appearance instability challenges in datasets due to aleatoric uncertainties, covariate shifts, and test domain generalization. SLYKLatent utilizes Self-Supervised Learning for initial training with facial expression datasets, followed by refinement with a patch-based tri-branch network and an inverse explained variance-weighted training loss function. Our evaluation on benchmark datasets achieves a 10.9% improvement on Gaze360, exceeds the top MPIIFaceGaze results by 3.8%, and leads on a subset of ETH-XGaze by 11.6%, surpassing existing methods by significant margins. Adaptability tests on RAF-DB and AffectNet show 86.4% and 60.9% accuracies, respectively. Ablation studies confirm the effectiveness of SLYKLatent's novel components.
Related papers
- A Self-Supervised Framework for Improved Generalisability in Ultrasound B-mode Image Segmentation [0.2556201059248933]
We introduce a contrastive SSL approach tailored for B-mode US images, incorporating a novel Relation Contrastive Loss (RCL).
Our approach significantly outperforms traditional supervised segmentation methods across three public breast US datasets.
Our research highlights that domain-inspired SSL can improve US segmentation, especially under data-limited conditions.
arXiv Detail & Related papers (2025-02-04T17:06:41Z)
- INTACT: Inducing Noise Tolerance through Adversarial Curriculum Training for LiDAR-based Safety-Critical Perception and Autonomy [0.4124847249415279]
We present a novel framework designed to enhance the robustness of deep neural networks (DNNs) against noisy LiDAR data.
IntACT combines meta-learning with adversarial curriculum training (ACT) to address challenges posed by data corruption and sparsity in 3D point clouds.
IntACT's effectiveness is demonstrated through comprehensive evaluations on object detection, tracking, and classification benchmarks.
arXiv Detail & Related papers (2025-02-04T00:02:16Z)
- Multiple Instance Learning with random sampling for Whole Slide Image Classification [0.0]
Random sampling of patches during training is computationally efficient and serves as a regularization strategy.
We find optimal performance enhancement of 1.7% using thirty percent of patches on the CAMELYON16 dataset, and 3.7% with only eight samples on the TUPAC16 dataset.
We also find interpretability effects are strongly dataset-dependent, with interpretability impacted on CAMELYON16, while remaining unaffected on TUPAC16.
arXiv Detail & Related papers (2024-03-08T14:31:40Z)
- BAL: Balancing Diversity and Novelty for Active Learning [53.289700543331925]
We introduce a novel framework, Balancing Active Learning (BAL), which constructs adaptive sub-pools to balance diverse and uncertain data.
Our approach outperforms all established active learning methods on widely recognized benchmarks by 1.20%.
arXiv Detail & Related papers (2023-12-26T08:14:46Z)
- Robust Uncertainty Estimation for Classification of Maritime Objects [0.34998703934432673]
We present a method joining the intra-class uncertainty achieved using Monte Carlo Dropout to gain more holistic uncertainty measures.
Our work improves the FPR95 by 8% compared to the current highest-performing work when the models are trained without out-of-distribution data.
We release the SHIPS dataset and show the effectiveness of our method by improving the FPR95 by 44.2% with respect to the baseline.
arXiv Detail & Related papers (2023-07-03T19:54:53Z)
- Patch-Level Contrasting without Patch Correspondence for Accurate and Dense Contrastive Representation Learning [79.43940012723539]
ADCLR is a self-supervised learning framework for learning accurate and dense vision representation.
Our approach achieves new state-of-the-art performance for contrastive methods.
arXiv Detail & Related papers (2023-06-23T07:38:09Z)
- Learning Diversified Feature Representations for Facial Expression Recognition in the Wild [97.14064057840089]
We propose a mechanism to diversify the features extracted by CNN layers of state-of-the-art facial expression recognition architectures.
Experimental results on three well-known facial expression recognition in-the-wild datasets, AffectNet, FER+, and RAF-DB, show the effectiveness of our method.
arXiv Detail & Related papers (2022-10-17T19:25:28Z)
- Training Strategies for Improved Lip-reading [61.661446956793604]
We investigate the performance of state-of-the-art data augmentation approaches, temporal models and other training strategies.
A combination of all the methods results in a classification accuracy of 93.4%, which is an absolute improvement of 4.6% over the current state-of-the-art performance.
An error analysis of the various training strategies reveals that the performance improves by increasing the classification accuracy of hard-to-recognise words.
arXiv Detail & Related papers (2022-09-03T09:38:11Z)
- A new weakly supervised approach for ALS point cloud semantic segmentation [1.4620086904601473]
We propose a deep-learning based weakly supervised framework for semantic segmentation of ALS point clouds.
We exploit potential information from unlabeled data subject to incomplete and sparse labels.
Our method achieves an overall accuracy of 83.0% and an average F1 score of 70.0%, which have increased by 6.9% and 12.8% respectively.
arXiv Detail & Related papers (2021-10-04T14:00:23Z)
- Tactile Grasp Refinement using Deep Reinforcement Learning and Analytic Grasp Stability Metrics [70.65363356763598]
We show that analytic grasp stability metrics constitute powerful optimization objectives for reinforcement learning algorithms.
We show that a combination of geometric and force-agnostic grasp stability metrics yields the highest average success rates of 95.4% for cuboids.
In a second experiment, we show that grasp refinement algorithms trained with contact feedback perform up to 6.6% better than a baseline that receives no tactile information.
arXiv Detail & Related papers (2021-09-23T09:20:19Z)
- To be Critical: Self-Calibrated Weakly Supervised Learning for Salient Object Detection [95.21700830273221]
Weakly-supervised salient object detection (WSOD) aims to develop saliency models using image-level annotations.
We propose a self-calibrated training strategy by explicitly establishing a mutual calibration loop between pseudo labels and network predictions.
We prove that even a much smaller dataset with well-matched annotations can help models achieve better performance and generalizability.
arXiv Detail & Related papers (2021-09-04T02:45:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.