TiCo: Transformation Invariance and Covariance Contrast for
Self-Supervised Visual Representation Learning
- URL: http://arxiv.org/abs/2206.10698v2
- Date: Thu, 23 Jun 2022 17:36:11 GMT
- Title: TiCo: Transformation Invariance and Covariance Contrast for
Self-Supervised Visual Representation Learning
- Authors: Jiachen Zhu, Rafael M. Moraes, Serkan Karakulak, Vlad Sobol, Alfredo
Canziani, Yann LeCun
- Abstract summary: We present Transformation Invariance and Covariance Contrast (TiCo) for self-supervised visual representation learning.
Our method is based on maximizing the agreement among embeddings of different distorted versions of the same image.
We show that TiCo can be viewed as a variant of MoCo with an implicit memory bank of unlimited size at no extra memory cost.
- Score: 9.507070656654632
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: We present Transformation Invariance and Covariance Contrast (TiCo) for
self-supervised visual representation learning. Similar to other recent
self-supervised learning methods, our method is based on maximizing the
agreement among embeddings of different distorted versions of the same image,
which pushes the encoder to produce transformation invariant representations.
To avoid the trivial solution where the encoder generates constant vectors, we
regularize the covariance matrix of the embeddings from different images by
penalizing low rank solutions. By jointly minimizing the transformation
invariance loss and covariance contrast loss, we get an encoder that is able to
produce useful representations for downstream tasks. We analyze our method and
show that it can be viewed as a variant of MoCo with an implicit memory bank of
unlimited size at no extra memory cost. This makes our method perform better
than alternative methods when using small batch sizes. TiCo can also be seen as
a modification of Barlow Twins. By connecting the contrastive and
redundancy-reduction methods together, TiCo gives us new insights into how
joint embedding methods work.
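The two terms described in the abstract can be sketched in a few lines. The following is a minimal illustrative sketch, not the authors' implementation: it assumes L2-normalized embeddings `z1`, `z2` of two augmented views, a momentum-updated covariance buffer `B_prev`, and hypothetical hyperparameter names `beta` (momentum) and `rho` (contrast weight).

```python
import numpy as np

def tico_loss(z1, z2, B_prev, beta=0.9, rho=8.0):
    """One training step of a TiCo-style objective (sketch).

    z1, z2: (N, D) L2-normalized embeddings of two views of the same images.
    B_prev: (D, D) running covariance estimate carried across batches.
    """
    n = z1.shape[0]
    # Momentum update of the covariance estimate from the current batch.
    B = beta * B_prev + (1.0 - beta) * (z1.T @ z1) / n
    # Transformation invariance: pull embeddings of the two views together.
    invariance = -np.mean(np.sum(z1 * z2, axis=1))
    # Covariance contrast: penalize embeddings that concentrate along few
    # directions, i.e. low-rank (collapsed) solutions.
    contrast = rho * np.mean(np.einsum('nd,de,ne->n', z1, B, z1))
    return invariance + contrast, B
```

Because the covariance buffer `B` aggregates statistics across batches, each sample is implicitly contrasted against past embeddings, which is the sense in which the method behaves like MoCo with an unbounded, zero-cost memory bank.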
Related papers
- Which Tokens to Use? Investigating Token Reduction in Vision
Transformers [64.99704164972513]
We study the reduction patterns of 10 different token reduction methods using four image classification datasets.
We find that the Top-K pruning method is a surprisingly strong baseline.
The similarity of reduction patterns is a moderate-to-strong proxy for model performance.
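The Top-K baseline mentioned above is simple to state: keep the k tokens with the highest importance score (for instance, CLS-to-token attention) and discard the rest. A minimal sketch, with hypothetical names and no claim to match the paper's implementation:

```python
import numpy as np

def topk_token_pruning(tokens, scores, k):
    """Keep the k highest-scoring tokens, preserving their original order.

    tokens: (N, D) token embeddings (excluding any CLS token).
    scores: (N,) importance scores, e.g. CLS-to-token attention weights.
    """
    keep = np.sort(np.argsort(scores)[-k:])  # top-k indices, original order
    return tokens[keep], keep
```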
arXiv Detail & Related papers (2023-08-09T01:51:07Z)
- EquiMod: An Equivariance Module to Improve Self-Supervised Learning [77.34726150561087]
Self-supervised visual representation methods are closing the gap with supervised learning performance.
These methods rely on maximizing the similarity between embeddings of related synthetic inputs created through data augmentations.
We introduce EquiMod, a generic equivariance module that structures the learned latent space.
arXiv Detail & Related papers (2022-11-02T16:25:54Z)
- A Reinforcement Learning Approach for Sequential Spatial Transformer Networks [6.585049648605185]
We formulate the task as a Markov Decision Process (MDP) and use reinforcement learning (RL) to solve this sequential decision-making problem.
Our method does not depend on the differentiability of the sampling modules.
We design multiple experiments to verify the effectiveness of our method using cluttered MNIST and Fashion-MNIST datasets.
arXiv Detail & Related papers (2021-06-27T17:41:17Z)
- VICReg: Variance-Invariance-Covariance Regularization for Self-Supervised Learning [43.96465407127458]
We introduce VICReg, a method that explicitly avoids the collapse problem with a simple regularization term on the variance of the embeddings.
VICReg achieves results on par with the state of the art on several downstream tasks.
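The variance term that VICReg uses to avoid collapse can be written as a hinge that keeps the per-dimension standard deviation of the batch embeddings above a target. A minimal sketch, assuming a target `gamma` and numerical-stability constant `eps` (names here are illustrative):

```python
import numpy as np

def variance_loss(z, gamma=1.0, eps=1e-4):
    """VICReg-style variance term (sketch): penalize any embedding
    dimension whose batch standard deviation falls below gamma."""
    std = np.sqrt(z.var(axis=0) + eps)
    return np.mean(np.maximum(0.0, gamma - std))
```

If the encoder outputs a constant vector, `std` is near zero and the loss is near `gamma`, so the gradient pushes the embeddings to spread out; once every dimension has standard deviation above `gamma`, the term vanishes.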
arXiv Detail & Related papers (2021-05-11T09:53:21Z)
- Invariant Deep Compressible Covariance Pooling for Aerial Scene Categorization [80.55951673479237]
We propose a novel invariant deep compressible covariance pooling (IDCCP) to solve nuisance variations in aerial scene categorization.
We conduct extensive experiments on the publicly released aerial scene image data sets and demonstrate the superiority of this method compared with state-of-the-art methods.
arXiv Detail & Related papers (2020-11-11T11:13:07Z)
- Improving Transformation Invariance in Contrastive Representation Learning [31.223892428863238]
First, we introduce a training objective for contrastive learning that uses a novel regularizer to control how the representation changes under transformation.
Second, we propose a change to how test time representations are generated by introducing a feature averaging approach that combines encodings from multiple transformations of the original input.
Third, we introduce the novel Spirograph dataset to explore our ideas in the context of a differentiable generative process with multiple downstream tasks.
arXiv Detail & Related papers (2020-10-19T13:49:29Z)
- FeatMatch: Feature-Based Augmentation for Semi-Supervised Learning [64.32306537419498]
We propose a novel learned feature-based refinement and augmentation method that produces a varied set of complex transformations.
These transformations also use information from both within-class and across-class representations that we extract through clustering.
We demonstrate that our method is comparable to the current state of the art on smaller datasets while being able to scale up to larger datasets.
arXiv Detail & Related papers (2020-07-16T17:55:31Z)
- Meta-Learning Symmetries by Reparameterization [63.85144439337671]
We present a method for learning and encoding equivariances into networks by learning corresponding parameter sharing patterns from data.
Our experiments suggest that it can automatically learn to encode equivariances to common transformations used in image processing tasks.
arXiv Detail & Related papers (2020-07-06T17:59:54Z)
- Unsupervised Learning of Visual Features by Contrasting Cluster Assignments [57.33699905852397]
We propose an online algorithm, SwAV, that takes advantage of contrastive methods without requiring pairwise comparisons to be computed.
Our method simultaneously clusters the data while enforcing consistency between cluster assignments.
Our method can be trained with large and small batches and can scale to unlimited amounts of data.
arXiv Detail & Related papers (2020-06-17T14:00:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.