Related papers: SLYKLatent, a Learning Framework for Facial Features Estimation

SLYKLatent, a Learning Framework for Facial Features Estimation

URL: http://arxiv.org/abs/2402.01555v1
Date: Fri, 2 Feb 2024 16:47:18 GMT
Title: SLYKLatent, a Learning Framework for Facial Features Estimation
Authors: Samuel Adebayo, Joost C. Dessing, Se\'an McLoone
Abstract summary: SLYKLatent is a novel approach for enhancing gaze estimation by addressing appearance instability challenges in datasets. Our evaluation on benchmark datasets achieves an 8.7% improvement on Gaze360, rivals top MPIIFaceGaze results, and leads on a subset of ETH-XGaze by 13%.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In this research, we present SLYKLatent, a novel approach for enhancing gaze estimation by addressing appearance instability challenges in datasets due to aleatoric uncertainties, covariant shifts, and test domain generalization. SLYKLatent utilizes Self-Supervised Learning for initial training with facial expression datasets, followed by refinement with a patch-based tri-branch network and an inverse explained variance-weighted training loss function. Our evaluation on benchmark datasets achieves an 8.7% improvement on Gaze360, rivals top MPIIFaceGaze results, and leads on a subset of ETH-XGaze by 13%, surpassing existing methods by significant margins. Adaptability tests on RAF-DB and Affectnet show 86.4% and 60.9% accuracies, respectively. Ablation studies confirm the effectiveness of SLYKLatent's novel components. This approach has strong potential in human-robot interaction.

Related papers

SST: Self-training with Self-adaptive Thresholding for Semi-supervised Learning [42.764994681999774]
Self-adaptive Thresholding (SST) is a novel, effective, and efficient SSL framework.<n>SST adjusts class-specific thresholds based on the model's learning progress.<n>Semi-SST-ViT-Huge achieves the best results on competitive ImageNet-1K SSL benchmarks.
arXiv Detail & Related papers (2025-05-31T08:34:04Z)
A Self-Supervised Framework for Improved Generalisability in Ultrasound B-mode Image Segmentation [0.2556201059248933]
We introduce a contrastive SSL approach tailored for B-mode US images, incorporating a novel Relation Contrastive Loss (RCL) Our approach significantly outperforms traditional supervised segmentation methods across three public breast US datasets. Our research highlights that domain-inspired SSL can improve US segmentation, especially under data-limited conditions.
arXiv Detail & Related papers (2025-02-04T17:06:41Z)
INTACT: Inducing Noise Tolerance through Adversarial Curriculum Training for LiDAR-based Safety-Critical Perception and Autonomy [0.4124847249415279]
We present a novel framework designed to enhance the robustness of deep neural networks (DNNs) against noisy LiDAR data. IntACT combines meta-learning with adversarial curriculum training (ACT) to address challenges posed by data corruption and sparsity in 3D point clouds. IntACT's effectiveness is demonstrated through comprehensive evaluations on object detection, tracking, and classification benchmarks.
arXiv Detail & Related papers (2025-02-04T00:02:16Z)
Investigating the Semantic Robustness of CLIP-based Zero-Shot Anomaly Segmentation [2.722220619798093]
We investigate the performance of a zero-shot anomaly segmentation algorithm by perturbing test data using three semantic transformations. We find that performance is consistently lowered on three CLIP backbones, regardless of model architecture or learning objective.
arXiv Detail & Related papers (2024-05-13T17:47:08Z)
Multiple Instance Learning with random sampling for Whole Slide Image Classification [0.0]
Random sampling of patches during training is computationally efficient and serves as a regularization strategy. We find optimal performance enhancement of 1.7% using thirty percent of patches on the CAMELYON16 dataset, and 3.7% with only eight samples on the TUPAC16 dataset. We also find interpretability effects are strongly dataset-dependent, with interpretability impacted on CAMELYON16, while remaining unaffected on TUPAC16.
arXiv Detail & Related papers (2024-03-08T14:31:40Z)
BAL: Balancing Diversity and Novelty for Active Learning [53.289700543331925]
We introduce a novel framework, Balancing Active Learning (BAL), which constructs adaptive sub-pools to balance diverse and uncertain data. Our approach outperforms all established active learning methods on widely recognized benchmarks by 1.20%.
arXiv Detail & Related papers (2023-12-26T08:14:46Z)
Robust Uncertainty Estimation for Classification of Maritime Objects [0.34998703934432673]
We present a method joining the intra-class uncertainty achieved using Monte Carlo Dropout to gain more holistic uncertainty measures. Our work improves the FPR95 by 8% compared to the current highest-performing work when the models are trained without out-of-distribution data. We release the SHIPS dataset and show the effectiveness of our method by improving the FPR95 by 44.2% with respect to the baseline.
arXiv Detail & Related papers (2023-07-03T19:54:53Z)
Patch-Level Contrasting without Patch Correspondence for Accurate and Dense Contrastive Representation Learning [79.43940012723539]
ADCLR is a self-supervised learning framework for learning accurate and dense vision representation. Our approach achieves new state-of-the-art performance for contrastive methods.
arXiv Detail & Related papers (2023-06-23T07:38:09Z)
Boosting Visual-Language Models by Exploiting Hard Samples [126.35125029639168]
HELIP is a cost-effective strategy tailored to enhance the performance of existing CLIP models. Our method allows for effortless integration with existing models' training pipelines. On comprehensive benchmarks, HELIP consistently boosts existing models to achieve leading performance.
arXiv Detail & Related papers (2023-05-09T07:00:17Z)
Co-supervised learning paradigm with conditional generative adversarial networks for sample-efficient classification [8.27719348049333]
This paper introduces a sample-efficient co-supervised learning paradigm (SEC-CGAN) SEC-CGAN is trained alongside the classifier and supplements semantics-conditioned, confidence-aware synthesized examples to the annotated data during the training process. Experiments demonstrate that the proposed SEC-CGAN outperforms the external classifier GAN and a baseline ResNet-18 classifier.
arXiv Detail & Related papers (2022-12-27T19:24:31Z)
Learning Diversified Feature Representations for Facial Expression Recognition in the Wild [97.14064057840089]
We propose a mechanism to diversify the features extracted by CNN layers of state-of-the-art facial expression recognition architectures. Experimental results on three well-known facial expression recognition in-the-wild datasets, AffectNet, FER+, and RAF-DB, show the effectiveness of our method.
arXiv Detail & Related papers (2022-10-17T19:25:28Z)
Training Strategies for Improved Lip-reading [61.661446956793604]
We investigate the performance of state-of-the-art data augmentation approaches, temporal models and other training strategies. A combination of all the methods results in a classification accuracy of 93.4%, which is an absolute improvement of 4.6% over the current state-of-the-art performance. An error analysis of the various training strategies reveals that the performance improves by increasing the classification accuracy of hard-to-recognise words.
arXiv Detail & Related papers (2022-09-03T09:38:11Z)
A new weakly supervised approach for ALS point cloud semantic segmentation [1.4620086904601473]
We propose a deep-learning based weakly supervised framework for semantic segmentation of ALS point clouds. We exploit potential information from unlabeled data subject to incomplete and sparse labels. Our method achieves an overall accuracy of 83.0% and an average F1 score of 70.0%, which have increased by 6.9% and 12.8% respectively.
arXiv Detail & Related papers (2021-10-04T14:00:23Z)
Tactile Grasp Refinement using Deep Reinforcement Learning and Analytic Grasp Stability Metrics [70.65363356763598]
We show that analytic grasp stability metrics constitute powerful optimization objectives for reinforcement learning algorithms. We show that a combination of geometric and force-agnostic grasp stability metrics yields the highest average success rates of 95.4% for cuboids. In a second experiment, we show that grasp refinement algorithms trained with contact feedback perform up to 6.6% better than a baseline that receives no tactile information.
arXiv Detail & Related papers (2021-09-23T09:20:19Z)
To be Critical: Self-Calibrated Weakly Supervised Learning for Salient Object Detection [95.21700830273221]
Weakly-supervised salient object detection (WSOD) aims to develop saliency models using image-level annotations. We propose a self-calibrated training strategy by explicitly establishing a mutual calibration loop between pseudo labels and network predictions. We prove that even a much smaller dataset with well-matched annotations can facilitate models to achieve better performance as well as generalizability.
arXiv Detail & Related papers (2021-09-04T02:45:22Z)

This list is automatically generated from the titles and abstracts of the papers in this site.