Contrastive Representation Learning for Gaze Estimation
- URL: http://arxiv.org/abs/2210.13404v1
- Date: Mon, 24 Oct 2022 17:01:18 GMT
- Title: Contrastive Representation Learning for Gaze Estimation
- Authors: Swati Jindal and Roberto Manduchi
- Abstract summary: We propose a contrastive representation learning framework for gaze estimation, named Gaze Contrastive Learning (GazeCLR).
Our results show that GazeCLR improves the performance of cross-domain gaze estimation and yields as high as 17.2% relative improvement.
The GazeCLR framework is competitive with state-of-the-art representation learning methods for few-shot evaluation.
- Score: 8.121462458089143
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Self-supervised learning (SSL) has become prevalent for learning
representations in computer vision. Notably, SSL exploits contrastive learning
to encourage visual representations to be invariant under various image
transformations. The task of gaze estimation, on the other hand, demands not
just invariance to various appearances but also equivariance to the geometric
transformations. In this work, we propose a simple contrastive representation
learning framework for gaze estimation, named Gaze Contrastive Learning
(GazeCLR). GazeCLR exploits multi-view data to promote equivariance and relies
on selected data augmentation techniques that do not alter gaze directions for
invariance learning. Our experiments demonstrate the effectiveness of GazeCLR
for several settings of the gaze estimation task. Particularly, our results
show that GazeCLR improves the performance of cross-domain gaze estimation and
yields as high as 17.2% relative improvement. Moreover, the GazeCLR framework
is competitive with state-of-the-art representation learning methods for
few-shot evaluation. The code and pre-trained models are available at
https://github.com/jswati31/gazeclr.
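The invariance side of frameworks like GazeCLR is typically trained with an InfoNCE-style contrastive objective: embeddings of two gaze-preserving views of the same sample are pulled together while other samples in the batch act as negatives. Below is a minimal NumPy sketch of such a loss; it is an illustration of the general contrastive objective, not the authors' exact implementation, and the function name and temperature value are assumptions.

```python
import numpy as np

def info_nce(anchors, positives, temperature=0.1):
    """InfoNCE contrastive loss.

    Row i of `positives` is the positive view for row i of `anchors`
    (e.g. the same face under a gaze-preserving augmentation); all
    other rows in the batch serve as negatives.
    """
    # L2-normalize so the dot product is cosine similarity.
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature
    # Subtract the row max for numerical stability before the softmax.
    logits -= logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # The matched (diagonal) pairs are the targets.
    return -np.mean(np.diag(log_probs))
```

As a sanity check, matched positives should yield a much lower loss than mismatched ones, since the loss rewards each anchor for being closest to its own positive.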
Related papers
- Contrastive Learning Via Equivariant Representation [19.112460889771423]
We propose CLeVER, a novel equivariant contrastive learning framework compatible with augmentation strategies of arbitrary complexity.
Experimental results demonstrate that CLeVER effectively extracts and incorporates equivariant information from practical natural images.
arXiv Detail & Related papers (2024-06-01T01:53:51Z)
- Decoupled Contrastive Learning for Long-Tailed Recognition [58.255966442426484]
Supervised Contrastive Loss (SCL) is popular in visual representation learning.
In the scenario of long-tailed recognition, where the number of samples in each class is imbalanced, treating two types of positive samples equally leads to the biased optimization for intra-category distance.
We propose patch-based self-distillation to transfer knowledge from head to tail classes, relieving the under-representation of tail classes.
arXiv Detail & Related papers (2024-03-10T09:46:28Z)
- CLIP-Gaze: Towards General Gaze Estimation via Visual-Linguistic Model [13.890404285565225]
We propose a novel framework called CLIP-Gaze that utilizes a pre-trained vision-language model to leverage its transferable knowledge.
Our framework is the first to leverage a vision-and-language cross-modality approach for the gaze estimation task.
arXiv Detail & Related papers (2024-03-08T07:37:21Z)
- Semi-supervised Contrastive Regression for Estimation of Eye Gaze [0.609170287691728]
This paper develops a semi-supervised contrastive learning framework for estimation of gaze direction.
With a small labeled gaze dataset, the framework is able to find a generalized solution even for unseen face images.
Our contrastive regression framework performs well in comparison to several state-of-the-art contrastive learning techniques used for gaze estimation.
arXiv Detail & Related papers (2023-08-05T04:11:38Z)
- ArCL: Enhancing Contrastive Learning with Augmentation-Robust Representations [30.745749133759304]
We develop a theoretical framework to analyze the transferability of self-supervised contrastive learning.
We show that contrastive learning fails to learn domain-invariant features, which limits its transferability.
Based on these theoretical insights, we propose a novel method called Augmentation-robust Contrastive Learning (ArCL)
arXiv Detail & Related papers (2023-03-02T09:26:20Z)
- RényiCL: Contrastive Representation Learning with Skew Rényi Divergence [78.15455360335925]
We present a new robust contrastive learning scheme, coined RényiCL, which can effectively manage harder augmentations.
Our method is built upon the variational lower bound of the Rényi divergence.
We show that Rényi contrastive learning objectives perform innate hard negative sampling and easy positive sampling simultaneously.
arXiv Detail & Related papers (2022-08-12T13:37:05Z)
- Towards Self-Supervised Gaze Estimation [32.91601919228028]
We propose SwAT, an equivariant version of the online clustering-based self-supervised approach SwAV, to learn more informative representations for gaze estimation.
We achieve up to 57% and 25% improvements in cross-dataset and within-dataset evaluation tasks on existing benchmarks.
arXiv Detail & Related papers (2022-03-21T13:35:16Z)
- Weak Augmentation Guided Relational Self-Supervised Learning [80.0680103295137]
We introduce a novel relational self-supervised learning (ReSSL) framework that learns representations by modeling the relationship between different instances.
Our proposed method employs a sharpened distribution of pairwise similarities among different instances as the relation metric.
Experimental results show that our proposed ReSSL substantially outperforms the state-of-the-art methods across different network architectures.
arXiv Detail & Related papers (2022-03-16T16:14:19Z)
- Why Do Self-Supervised Models Transfer? Investigating the Impact of Invariance on Downstream Tasks [79.13089902898848]
Self-supervised learning is a powerful paradigm for representation learning on unlabelled images.
We show that different tasks in computer vision require features to encode different (in)variances.
arXiv Detail & Related papers (2021-11-22T18:16:35Z)
- Weakly Supervised Contrastive Learning [68.47096022526927]
We introduce a weakly supervised contrastive learning framework (WCL) to tackle the class collision issue in instance-wise contrastive learning.
WCL achieves 65% and 72% ImageNet Top-1 Accuracy using ResNet50, which is even higher than SimCLRv2 with ResNet101.
arXiv Detail & Related papers (2021-10-10T12:03:52Z)
- ReSSL: Relational Self-Supervised Learning with Weak Augmentation [68.47096022526927]
Self-supervised learning has achieved great success in learning visual representations without data annotations.
We introduce a novel relational SSL paradigm that learns representations by modeling the relationship between different instances.
Our proposed ReSSL significantly outperforms the previous state-of-the-art algorithms in terms of both performance and training efficiency.
arXiv Detail & Related papers (2021-07-20T06:53:07Z)
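The ReSSL entries above describe learning by matching similarity distributions rather than individual positive pairs: a student view's similarity distribution over a set of other instances is aligned with a teacher view's sharper (lower-temperature) distribution. The following NumPy sketch illustrates that relation-matching idea under assumed temperatures and names; it is not the authors' code.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def relational_loss(student, teacher, bank, t_student=0.1, t_teacher=0.04):
    """Cross-entropy between student and teacher similarity distributions.

    Each row of `student` / `teacher` is an embedding of two views of the
    same instance; `bank` holds embeddings of other instances. The teacher
    uses a lower temperature, so its distribution is sharper and serves
    as the target.
    """
    norm = lambda z: z / np.linalg.norm(z, axis=1, keepdims=True)
    s = softmax(norm(student) @ norm(bank).T / t_student)
    t = softmax(norm(teacher) @ norm(bank).T / t_teacher)  # sharpened target
    return -np.mean(np.sum(t * np.log(s + 1e-12), axis=1))
```

The lower teacher temperature is the "sharpened distribution" mentioned in the summaries: it concentrates the target probability mass on the most similar instances, so the student learns relative similarities instead of hard pairwise labels.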
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.