Towards a Unified Theoretical Understanding of Non-contrastive Learning
via Rank Differential Mechanism
- URL: http://arxiv.org/abs/2303.02387v1
- Date: Sat, 4 Mar 2023 11:36:41 GMT
- Title: Towards a Unified Theoretical Understanding of Non-contrastive Learning
via Rank Differential Mechanism
- Authors: Zhijian Zhuo, Yifei Wang, Jinwen Ma, Yisen Wang
- Abstract summary: A variety of methods under the name of non-contrastive learning (like BYOL, SimSiam, SwAV, DINO) show that aligning positive pairs alone is sufficient to attain good performance in self-supervised visual learning.
We propose a unified theoretical understanding for existing variants of non-contrastive learning.
Our theory, named the Rank Differential Mechanism (RDM), shows that all these asymmetric designs create a consistent rank difference in their dual-branch output features.
- Score: 26.17829763295478
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, a variety of methods under the name of non-contrastive learning
(like BYOL, SimSiam, SwAV, DINO) show that when equipped with some asymmetric
architectural designs, aligning positive pairs alone is sufficient to attain
good performance in self-supervised visual learning. Despite some understanding
of specific modules (like the predictor in BYOL), there is not yet a unified
theoretical account of how these seemingly different asymmetric designs all
avoid feature collapse, particularly for methods that work without a predictor
(like DINO). In this work, we
propose a unified theoretical understanding for existing variants of
non-contrastive learning. Our theory, named the Rank Differential Mechanism (RDM),
shows that all these asymmetric designs create a consistent rank difference in
their dual-branch output features. This rank difference will provably lead to
an improvement of effective dimensionality and alleviate either complete or
dimensional feature collapse. Different from previous theories, our RDM theory
is applicable to different asymmetric designs (with and without the predictor),
and thus can serve as a unified understanding of existing non-contrastive
learning methods. Besides, our RDM theory also provides practical guidelines
for designing many new non-contrastive variants. We show that these variants
indeed achieve comparable performance to existing methods on benchmark
datasets, and some of them even outperform the baselines. Our code is available
at https://github.com/PKU-ML/Rank-Differential-Mechanism.
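To make the abstract's key quantity concrete, here is a minimal sketch (not taken from the authors' repository above) of how one might measure the effective rank of dual-branch output features, using the standard exponential-of-spectral-entropy definition of effective rank. The feature matrices and the spectral decay schedule below are simulated assumptions for illustration only, not the paper's actual architecture.
```python
# Minimal sketch, assuming the exponential-of-spectral-entropy definition of
# effective rank; feature matrices are simulated stand-ins, not outputs of an
# actual BYOL/SimSiam/SwAV/DINO model.
import numpy as np


def effective_rank(features: np.ndarray, eps: float = 1e-12) -> float:
    """exp(entropy of the normalized singular-value spectrum) of a feature matrix."""
    s = np.linalg.svd(features, compute_uv=False)
    p = s / (s.sum() + eps)                 # normalize the spectrum to a distribution
    entropy = -(p * np.log(p + eps)).sum()  # Shannon entropy of that distribution
    return float(np.exp(entropy))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d = 1024, 256                         # n feature vectors of dimension d
    branch_a = rng.standard_normal((n, d))   # stand-in for one branch's outputs
    # A second branch with a faster-decaying spectrum (lower effective rank).
    # Which real branch sits lower is part of what the paper characterizes, so
    # the direction here is chosen arbitrarily for illustration.
    decay = np.exp(-np.arange(d) / 32.0)
    branch_b = branch_a * decay              # re-weight feature dimensions
    print(f"branch A effective rank: {effective_rank(branch_a):6.1f}")
    print(f"branch B effective rank: {effective_rank(branch_b):6.1f}")
```
The gap between the two printed numbers is the kind of rank difference that, according to the RDM theory, is what keeps the effective dimensionality of the learned representation from collapsing.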
Related papers
- Learning Neural Strategy-Proof Matching Mechanism from Examples [24.15688619889342]
We develop a novel attention-based neural network called NeuralSD, which can learn a strategy-proof mechanism from a human-crafted dataset.
We conducted experiments to learn a strategy-proof matching from matching examples with different numbers of agents.
arXiv Detail & Related papers (2024-10-25T08:34:25Z)
- A Canonicalization Perspective on Invariant and Equivariant Learning [54.44572887716977]
We introduce a canonicalization perspective that provides an essential and complete view of the design of frames.
We show that there exists an inherent connection between frames and canonical forms.
We design novel frames for eigenvectors that are strictly superior to existing methods.
arXiv Detail & Related papers (2024-05-28T17:22:15Z)
- Learning Probabilistic Symmetrization for Architecture Agnostic Equivariance [16.49488981364657]
We present a novel framework to overcome the limitations of equivariant architectures in learning functions with group symmetries.
We use an arbitrary base model, such as an MLP or a transformer, and symmetrize it to be equivariant to the given group.
Empirical tests show competitive results against tailored equivariant architectures.
arXiv Detail & Related papers (2023-06-05T13:40:54Z)
- Evaluating the Robustness of Interpretability Methods through Explanation Invariance and Equivariance [72.50214227616728]
Interpretability methods are valuable only if their explanations faithfully describe the explained model.
We consider neural networks whose predictions are invariant under a specific symmetry group.
arXiv Detail & Related papers (2023-04-13T17:59:03Z)
- Neural Bregman Divergences for Distance Learning [60.375385370556145]
We propose a new approach to learning arbitrary Bregman divergences in a differentiable manner via input convex neural networks.
We show that our method more faithfully learns divergences over a set of both new and previously studied tasks.
Our tests further extend to known asymmetric, but non-Bregman tasks, where our method still performs competitively despite misspecification.
arXiv Detail & Related papers (2022-06-09T20:53:15Z)
- On the duality between contrastive and non-contrastive self-supervised learning [0.0]
Self-supervised learning can be divided into contrastive and non-contrastive approaches.
We show how close the contrastive and non-contrastive families can be.
We also show the influence (or lack thereof) of design choices on downstream performance.
arXiv Detail & Related papers (2022-06-03T08:04:12Z)
- MACE: An Efficient Model-Agnostic Framework for Counterfactual Explanation [132.77005365032468]
We propose a novel framework of Model-Agnostic Counterfactual Explanation (MACE).
In our MACE approach, we propose a novel RL-based method for finding good counterfactual examples and a gradient-less descent method for improving proximity.
Experiments on public datasets validate the effectiveness with better validity, sparsity and proximity.
arXiv Detail & Related papers (2022-05-31T04:57:06Z)
- On the Importance of Asymmetry for Siamese Representation Learning [53.86929387179092]
Siamese networks are conceptually symmetric with two parallel encoders.
We study the importance of asymmetry by explicitly distinguishing the two encoders within the network.
We find the improvements from asymmetric designs generalize well to longer training schedules, multiple other frameworks and newer backbones.
arXiv Detail & Related papers (2022-04-01T17:57:24Z)
- Chaos is a Ladder: A New Theoretical Understanding of Contrastive Learning via Augmentation Overlap [64.60460828425502]
We propose a new guarantee on the downstream performance of contrastive learning.
Our new theory hinges on the insight that the support of different intra-class samples will become more overlapped under aggressive data augmentations.
We propose an unsupervised model selection metric ARC that aligns well with downstream accuracy.
arXiv Detail & Related papers (2022-03-25T05:36:26Z)
- Exploring the Equivalence of Siamese Self-Supervised Learning via A Unified Gradient Framework [43.76337849044254]
Self-supervised learning has shown its great potential to extract powerful visual representations without human annotations.
Various works are proposed to deal with self-supervised learning from different perspectives.
We propose UniGrad, a simple but effective gradient form for self-supervised learning.
arXiv Detail & Related papers (2021-12-09T18:59:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.