Similarity Contrastive Estimation for Image and Video Soft Contrastive Self-Supervised Learning
- URL: http://arxiv.org/abs/2212.11187v1
- Date: Wed, 21 Dec 2022 16:56:55 GMT
- Title: Similarity Contrastive Estimation for Image and Video Soft Contrastive Self-Supervised Learning
- Authors: Julien Denize, Jaonary Rabarisoa, Astrid Orcesi, Romain Hérault
- Abstract summary: We propose a novel formulation of contrastive learning using semantic similarity between instances.
Our training objective is a soft contrastive one that brings the positives closer and estimates a continuous distribution to push or pull negative instances.
We show that SCE reaches state-of-the-art results for pretraining video representation and that the learned representation can generalize to video downstream tasks.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Contrastive representation learning has proven to be an effective
self-supervised learning method for images and videos. Most successful
approaches are based on Noise Contrastive Estimation (NCE) and use different
views of an instance as positives that should be contrasted with other
instances, called negatives, that are considered as noise. However, several
instances in a dataset are drawn from the same distribution and share
underlying semantic information. A good data representation should capture the
relations between instances, i.e., their semantic similarity and dissimilarity, which
contrastive learning harms by treating all negatives as noise. To circumvent
this issue, we propose a novel formulation of contrastive learning using
semantic similarity between instances called Similarity Contrastive Estimation
(SCE). Our training objective is a soft contrastive one that brings the
positives closer and estimates a continuous distribution to push or pull
negative instances based on their learned similarities. We empirically validate
our approach on both image and video representation learning. We show that SCE
performs competitively with the state of the art on the ImageNet linear
evaluation protocol for fewer pretraining epochs and that it generalizes to
several downstream image tasks. We also show that SCE reaches state-of-the-art
results for pretraining video representation and that the learned
representation can generalize to video downstream tasks.
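
To make the objective concrete, below is a minimal PyTorch sketch of a soft contrastive loss following the abstract's description: the target distribution mixes a one-hot on the positive with a continuous similarity distribution over negatives estimated by a momentum/target encoder. The function name, the memory queue, and the hyperparameters (lam, tau, tau_m) are illustrative assumptions, not the authors' released implementation.

import torch
import torch.nn.functional as F

def sce_loss(z_online, z_target, queue, lam=0.5, tau=0.1, tau_m=0.05):
    """Hedged sketch of an SCE-style soft contrastive objective.

    z_online: (B, D) view-1 embeddings from the online encoder.
    z_target: (B, D) view-2 embeddings from a momentum/target encoder.
    queue:    (K, D) negative embeddings (e.g. a memory bank).
    """
    z_online = F.normalize(z_online, dim=1)
    z_target = F.normalize(z_target, dim=1)
    queue = F.normalize(queue, dim=1)
    B = z_online.size(0)

    # Predicted distribution over [positive, negatives] at temperature tau.
    pos = (z_online * z_target).sum(dim=1, keepdim=True)    # (B, 1)
    neg = z_online @ queue.t()                               # (B, K)
    log_p = F.log_softmax(torch.cat([pos, neg], dim=1) / tau, dim=1)

    with torch.no_grad():
        # Continuous similarity distribution over negatives, estimated
        # from the target view with a sharper temperature tau_m.
        s = F.softmax(z_target @ queue.t() / tau_m, dim=1)   # (B, K)
        one_hot = torch.zeros(B, 1 + s.size(1), device=s.device)
        one_hot[:, 0] = 1.0
        soft = torch.cat([torch.zeros(B, 1, device=s.device), s], dim=1)
        w = lam * one_hot + (1.0 - lam) * soft               # rows sum to 1

    # Soft cross-entropy: pull the positive closer, and push or pull each
    # negative according to its estimated similarity.
    return -(w * log_p).sum(dim=1).mean()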
Related papers
- Soft Neighbors are Positive Supporters in Contrastive Visual
Representation Learning [35.53729744330751]
Contrastive learning methods train visual encoders by comparing views from one instance to others.
This binary instance discrimination is studied extensively to improve feature representations in self-supervised learning.
In this paper, we rethink the instance discrimination framework and find the binary instance labeling insufficient to measure correlations between different samples.
arXiv Detail & Related papers (2023-03-30T04:22:07Z)
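
The binary labeling questioned above is easiest to see in a minimal InfoNCE sketch: row i of each view is the only positive, and every other sample gets a hard label of zero regardless of semantic content. The snippet below is a generic illustration (names and temperature are assumptions), not the paper's method.

import torch
import torch.nn.functional as F

def info_nce(z1, z2, tau=0.1):
    """Binary instance discrimination: one positive per row, all else noise."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / tau                           # (B, B) similarities
    labels = torch.arange(z1.size(0), device=z1.device)  # positives on diagonal
    return F.cross_entropy(logits, labels)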
- Modulated Contrast for Versatile Image Synthesis
MoNCE introduces image contrast to learn a calibrated metric for the perception of multifaceted inter-image distances.
We introduce optimal transport in MoNCE to modulate the pushing force of negative samples collaboratively across multiple contrastive objectives.
arXiv Detail & Related papers (2022-03-17T14:03:46Z)
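
As a rough, non-authoritative illustration of the optimal-transport idea above, the sketch below reweights the negative terms of InfoNCE with a transport plan computed by a small Sinkhorn loop, so the pushing force concentrates on some negatives rather than being uniform. The solver, weighting scheme, and hyperparameters are assumptions, not the MoNCE formulation.

import torch
import torch.nn.functional as F

def sinkhorn(scores, n_iters=5, eps=0.5):
    # Tiny Sinkhorn loop: turn a (B, B) similarity matrix into an
    # (approximately) doubly-stochastic transport plan. Illustrative solver.
    q = torch.exp(scores / eps)
    for _ in range(n_iters):
        q = q / q.sum(dim=1, keepdim=True)   # normalize rows
        q = q / q.sum(dim=0, keepdim=True)   # normalize columns
    return q

def monce_style_loss(z1, z2, tau=0.1):
    # Hedged sketch: modulate InfoNCE's negative terms with transport weights.
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    cos = z1 @ z2.t()                                    # (B, B) cosine sims
    with torch.no_grad():
        w = sinkhorn(cos) * cos.size(0)                  # weights, mean ~ 1
    exp_sim = torch.exp(cos / tau)
    pos = exp_sim.diag()
    off_diag = ~torch.eye(cos.size(0), dtype=torch.bool, device=cos.device)
    neg = (w * exp_sim * off_diag).sum(dim=1)            # weighted negatives
    return -torch.log(pos / (pos + neg)).mean()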
- Robust Contrastive Learning against Noisy Views
We propose a new contrastive loss function that is robust against noisy views.
We show that our approach provides consistent improvements over the state of the art on image, video, and graph contrastive learning benchmarks.
arXiv Detail & Related papers (2022-01-12T05:24:29Z)
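
A hedged sketch of a loss in this spirit: RINCE-style objectives exponentiate the two InfoNCE terms by q in (0, 1], which bounds the gradient contributed by a suspicious (possibly false) positive pair. The values of q, lam, and tau below are illustrative, not the paper's settings.

import torch
import torch.nn.functional as F

def rince_loss(z1, z2, q=0.5, lam=0.01, tau=0.1):
    """Robust contrastive sketch: -s_pos^q / q + (lam * sum(s))^q / q."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    sims = torch.exp(z1 @ z2.t() / tau)   # (B, B), s = exp(sim / tau)
    pos = sims.diag()                     # positive-pair scores
    all_ = sims.sum(dim=1)                # positive plus all negatives
    return (-(pos ** q) / q + ((lam * all_) ** q) / q).mean()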
- Similarity Contrastive Estimation for Self-Supervised Soft Contrastive Learning
We argue that a good data representation contains the relations, or semantic similarity, between the instances.
We propose a novel formulation of contrastive learning using semantic similarity between instances, called Similarity Contrastive Estimation (SCE).
Our training objective can be considered as soft contrastive learning.
arXiv Detail & Related papers (2021-11-29T15:19:15Z)
- Investigating the Role of Negatives in Contrastive Representation Learning
Noise contrastive learning is a popular technique for unsupervised representation learning.
We focus on disambiguating the role of one key parameter: the number of negative examples.
We find that the results broadly agree with our theory, while our vision experiments are murkier, with performance sometimes even insensitive to the number of negatives.
arXiv Detail & Related papers (2021-06-18T06:44:16Z)
- Contrastive Learning of Image Representations with Cross-Video Cycle-Consistency
Cross-video relations have barely been explored for visual representation learning.
We propose a novel contrastive learning method which explores the cross-video relation by using cycle-consistency for general image representation learning.
We show significant improvement over state-of-the-art contrastive learning methods.
arXiv Detail & Related papers (2021-05-13T17:59:11Z)
- CoCon: Cooperative-Contrastive Learning
Self-supervised visual representation learning is key for efficient video analysis.
Recent success in learning image representations suggests contrastive learning is a promising framework to tackle this challenge.
We introduce a cooperative variant of contrastive learning to utilize complementary information across views.
arXiv Detail & Related papers (2021-04-30T05:46:02Z)
- Robust Audio-Visual Instance Discrimination
We present a self-supervised learning method to learn audio and video representations.
We address the problems of audio-visual instance discrimination and improve transfer learning performance.
arXiv Detail & Related papers (2021-03-29T19:52:29Z)
- Whitening for Self-Supervised Representation Learning
We propose a new loss function for self-supervised representation learning (SSL) based on the whitening of latent-space features.
Our solution does not require asymmetric networks and it is conceptually simple.
arXiv Detail & Related papers (2020-07-13T12:33:25Z)
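
A minimal sketch of the whitening idea, assuming ZCA whitening over the batch for simplicity (the paper itself may use a different whitening transform): once the embedding covariance is forced to the identity, minimizing a plain MSE between positive pairs cannot collapse to a constant representation, so no negatives or asymmetric networks are needed.

import torch
import torch.nn.functional as F

def whitening_mse(z1, z2, eps=1e-4):
    """Whiten the concatenated batch, then pull positive pairs together."""
    z = torch.cat([z1, z2], dim=0)                  # (2B, D) both views
    z = z - z.mean(dim=0, keepdim=True)
    cov = z.t() @ z / (z.size(0) - 1)               # (D, D) covariance
    # ZCA whitening matrix cov^{-1/2} via eigendecomposition (eps: ridge).
    eye = torch.eye(z.size(1), device=z.device)
    eigval, eigvec = torch.linalg.eigh(cov + eps * eye)
    w = eigvec @ torch.diag(eigval.rsqrt()) @ eigvec.t()
    zw1, zw2 = (z @ w).chunk(2, dim=0)              # whitened views
    return F.mse_loss(F.normalize(zw1, dim=1), F.normalize(zw2, dim=1))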
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences arising from its use.