On the Importance of Asymmetry for Siamese Representation Learning
- URL: http://arxiv.org/abs/2204.00613v1
- Date: Fri, 1 Apr 2022 17:57:24 GMT
- Title: On the Importance of Asymmetry for Siamese Representation Learning
- Authors: Xiao Wang, Haoqi Fan, Yuandong Tian, Daisuke Kihara, Xinlei Chen
- Abstract summary: Siamese networks are conceptually symmetric with two parallel encoders.
We study the importance of asymmetry by explicitly distinguishing the two encoders within the network.
We find the improvements from asymmetric designs generalize well to longer training schedules, multiple other frameworks and newer backbones.
- Score: 53.86929387179092
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Many recent self-supervised frameworks for visual representation learning are
based on certain forms of Siamese networks. Such networks are conceptually
symmetric with two parallel encoders, but often practically asymmetric as
numerous mechanisms are devised to break the symmetry. In this work, we conduct
a formal study on the importance of asymmetry by explicitly distinguishing the
two encoders within the network -- one produces source encodings and the other
targets. Our key insight is that keeping a relatively lower variance in the
target encodings than in the source generally benefits learning. This is
empirically justified by our results from five case studies covering
different variance-oriented designs, and is aligned with our preliminary
theoretical analysis of the baseline.
Moreover, we find the improvements from asymmetric designs generalize well to
longer training schedules, multiple other frameworks and newer backbones.
Finally, the combined effect of several asymmetric designs achieves a
state-of-the-art accuracy on ImageNet linear probing and competitive results on
downstream transfer. We hope our exploration will inspire more research in
exploiting asymmetry for Siamese representation learning.
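To make the source/target asymmetry concrete, below is a minimal, hypothetical PyTorch sketch in the SimSiam style (not the authors' code): the source branch carries a predictor and receives gradients, while the target branch is gradient-stopped, and averaging several augmented views is one simple way to lower the target's variance. The `encoder` and `predictor` here are toy stand-ins for a real backbone plus projection/prediction MLPs.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Minimal sketch, not the paper's implementation: `encoder` and `predictor`
# stand in for a real backbone plus projection/prediction MLPs.
encoder = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 32))
predictor = nn.Sequential(nn.Linear(32, 32), nn.ReLU(), nn.Linear(32, 32))

def asymmetric_step(source_view, target_views):
    """source_view: (B, 128); target_views: list of (B, 128) augmented views."""
    p = predictor(encoder(source_view))   # source branch: gradients flow
    with torch.no_grad():                 # target branch: stop-gradient
        # Averaging several views is one simple variance-lowering design.
        z = torch.stack([encoder(v) for v in target_views]).mean(0)
    # Negative cosine similarity, as in SimSiam-style objectives.
    return -F.cosine_similarity(p, z, dim=-1).mean()

views = [torch.randn(8, 128) for _ in range(3)]  # toy stand-ins for augmentations
loss = asymmetric_step(views[0], views[1:])
loss.backward()
```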
Related papers
- Symmetry Breaking in Neural Network Optimization: Insights from Input Dimension Expansion [41.170854872233534]
We show that a simple input expansion can significantly improve network performance across various tasks.
By quantifying the degree of symmetry breaking, our work offers a practical technique for performance enhancement and a metric to guide network design.
arXiv Detail & Related papers (2024-09-10T10:36:40Z)
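The input-expansion idea above is easy to picture. A minimal, hypothetical sketch (the paper's actual expansion scheme may differ): widen the input by concatenating a fixed, non-learnable random projection of it, so the first layer sees a higher-dimensional, less symmetric input.

```python
import torch
import torch.nn as nn

# Hypothetical illustration of "input expansion"; details differ from the paper.
class InputExpansion(nn.Module):
    def __init__(self, in_dim, extra_dim):
        super().__init__()
        proj = torch.randn(in_dim, extra_dim) / in_dim ** 0.5
        self.register_buffer("proj", proj)  # fixed, never trained

    def forward(self, x):
        return torch.cat([x, x @ self.proj], dim=-1)

model = nn.Sequential(InputExpansion(16, 16), nn.Linear(32, 10))
print(model(torch.randn(4, 16)).shape)  # torch.Size([4, 10])
```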
- The Empirical Impact of Neural Parameter Symmetries, or Lack Thereof [50.49582712378289]
We investigate the impact of neural parameter symmetries by introducing new neural network architectures.
We develop two methods, with some provable guarantees, of modifying standard neural networks to reduce parameter space symmetries.
Our experiments reveal several interesting observations on the empirical impact of parameter symmetries.
arXiv Detail & Related papers (2024-05-30T16:32:31Z)
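One generic way to reduce parameter-space symmetry, sketched below on our own (the paper's constructions differ in detail): fix part of each layer so that hidden units are no longer interchangeable. With a fixed, distinct per-unit offset and no learnable bias to absorb it, permuting hidden units changes the network's function, removing the usual permutation symmetry.

```python
import torch
import torch.nn as nn

# Hypothetical illustration only, not the paper's architecture: a linear layer
# whose pre-activations receive a fixed, non-trainable, unit-specific offset.
# No learnable bias is used, otherwise the bias could simply absorb the offset
# and restore the permutation symmetry.
class AsymmetricLinear(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim, bias=False)
        self.register_buffer("offset", torch.linspace(-1.0, 1.0, out_dim))

    def forward(self, x):
        return self.linear(x) + self.offset  # offset is never trained

net = nn.Sequential(AsymmetricLinear(8, 32), nn.ReLU(), nn.Linear(32, 2))
print(net(torch.randn(4, 8)).shape)  # torch.Size([4, 2])
```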
- Enhancing lattice kinetic schemes for fluid dynamics with Lattice-Equivariant Neural Networks [79.16635054977068]
We present a new class of equivariant neural networks, dubbed Lattice-Equivariant Neural Networks (LENNs).
Our approach develops within a recently introduced framework aimed at learning neural network-based surrogate models of Lattice Boltzmann collision operators.
Our work opens the way towards the practical use of machine-learning-augmented Lattice Boltzmann CFD in real-world simulations.
arXiv Detail & Related papers (2024-05-22T17:23:15Z)
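Lattice symmetries are discrete, so a generic group-averaging wrapper already conveys the flavor of equivariance on a square lattice. The following is our own illustration, not the LENN construction: averaging over all four 90-degree rotations makes any image-to-image network equivariant to them.

```python
import torch
import torch.nn as nn

# Generic group-averaging trick (not the LENN construction itself): the
# wrapped network becomes equivariant to 90-degree rotations, a discrete
# symmetry of the square lattice.
class C4Equivariant(nn.Module):
    def __init__(self, net):
        super().__init__()
        self.net = net

    def forward(self, x):  # x: (B, C, H, W)
        outs = [torch.rot90(self.net(torch.rot90(x, k, (2, 3))), -k, (2, 3))
                for k in range(4)]
        return torch.stack(outs).mean(0)

net = C4Equivariant(nn.Conv2d(1, 1, 3, padding=1))
x = torch.randn(1, 1, 8, 8)
# Equivariance check: rotating the input rotates the output.
print(torch.allclose(net(torch.rot90(x, 1, (2, 3))),
                     torch.rot90(net(x), 1, (2, 3)), atol=1e-5))  # True
```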
- Symmetry Induces Structure and Constraint of Learning [0.0]
We unveil the importance of loss function symmetries in affecting, if not deciding, the learning behavior of machine learning models.
Common instances of mirror symmetries in deep learning include rescaling, rotation, and permutation symmetry.
We show that the theoretical framework can explain intriguing phenomena, such as the loss of plasticity and various collapse phenomena in neural networks.
arXiv Detail & Related papers (2023-09-29T02:21:31Z)
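The rescaling symmetry mentioned in the entry above is easy to verify numerically: in a two-layer linear model the loss depends only on the product of the weight matrices, so scaling one factor up and the other down leaves it unchanged.

```python
import torch

# Rescaling symmetry in a two-layer linear model: the loss depends only on
# w2 @ w1, so (w1, w2) -> (lam * w1, w2 / lam) leaves it invariant.
def loss(w1, w2, x, y):
    return ((w2 @ (w1 @ x) - y) ** 2).mean()

torch.manual_seed(0)
w1, w2 = torch.randn(4, 3), torch.randn(2, 4)
x, y = torch.randn(3, 8), torch.randn(2, 8)
lam = 3.7
print(torch.allclose(loss(w1, w2, x, y), loss(lam * w1, w2 / lam, x, y)))  # True
```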
- E(3)-Equivariant Actor-Critic Methods for Cooperative Multi-Agent Reinforcement Learning [7.712824077083934]
We focus on exploiting Euclidean symmetries inherent in certain cooperative multi-agent reinforcement learning problems.
We design neural network architectures with symmetric constraints embedded as an inductive bias for multi-agent actor-critic methods.
arXiv Detail & Related papers (2023-08-23T00:18:17Z)
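A common way to obtain such equivariance, shown below only as a stand-in for the paper's E(3) architectures, is canonicalization: rotate the observation into a canonical frame, apply an arbitrary policy, and rotate the resulting action back.

```python
import torch

# Generic rotation-canonicalization trick (not the paper's construction):
# the wrapped policy is equivariant to planar rotations of the observation.
def rot(theta):
    c, s = torch.cos(theta), torch.sin(theta)
    return torch.stack([torch.stack([c, -s]), torch.stack([s, c])])

policy = torch.nn.Linear(2, 2)  # arbitrary, non-equivariant base policy

def equivariant_policy(rel_pos):            # rel_pos: (2,) teammate offset
    theta = torch.atan2(rel_pos[1], rel_pos[0])
    canon = rot(-theta) @ rel_pos           # canonical frame: teammate on +x axis
    return rot(theta) @ policy(canon)       # rotate the action back

x = torch.randn(2)
g = rot(torch.tensor(0.7))
# Equivariance check: rotating the input rotates the output action.
print(torch.allclose(equivariant_policy(g @ x), g @ equivariant_policy(x), atol=1e-5))
```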
- SO(2) and O(2) Equivariance in Image Recognition with Bessel-Convolutional Neural Networks [63.24965775030674]
This work presents the development of Bessel-convolutional neural networks (B-CNNs).
B-CNNs exploit a particular decomposition based on Bessel functions to modify the key operation between images and filters.
A study is carried out to assess the performance of B-CNNs compared to other methods.
arXiv Detail & Related papers (2023-04-18T18:06:35Z)
- Towards a Unified Theoretical Understanding of Non-contrastive Learning via Rank Differential Mechanism [26.17829763295478]
A variety of methods under the name of non-contrastive learning (like BYOL, SimSiam, SwAV, DINO) show that aligning positive pairs alone is sufficient to attain good performance in self-supervised visual learning.
We propose a unified theoretical understanding for existing variants of non-contrastive learning.
Our theory named Rank Differential Mechanism (RDM) shows that all these asymmetric designs create a consistent rank difference in their dual-branch output features.
arXiv Detail & Related papers (2023-03-04T11:36:41Z)
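The rank difference that RDM identifies can be probed with a standard effective-rank statistic, the entropy of the normalized singular values; the sketch below is our own illustration, not the paper's code.

```python
import torch

# Effective rank via the entropy of normalized singular values -- a common
# statistic for comparing the "rank" of the two branches' output features.
def effective_rank(features, eps=1e-12):
    s = torch.linalg.svdvals(features)
    p = s / (s.sum() + eps)
    return torch.exp(-(p * torch.log(p + eps)).sum())

torch.manual_seed(0)
online = torch.randn(256, 64)                    # stand-in for one branch
U, S, Vh = torch.linalg.svd(online, full_matrices=False)
S[16:] *= 0.1                                    # damp the tail of the spectrum
target = U @ torch.diag(S) @ Vh                  # lower-effective-rank branch
print(effective_rank(online) > effective_rank(target))  # True
```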
- Asymmetric Distribution Measure for Few-shot Learning [82.91276814477126]
Metric-based few-shot image classification aims to measure the relations between query images and support classes.
We propose a novel Asymmetric Distribution Measure (ADM) network for few-shot learning.
We achieve 3.02% and 1.56% gains over the state-of-the-art method on the 5-way 1-shot task.
arXiv Detail & Related papers (2020-02-01T06:41:52Z)
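The asymmetry here lives in the measure itself: unlike Euclidean or cosine distances, a divergence between a query and a class distribution need not be symmetric in its arguments. A simplified stand-in, not ADM's exact formulation:

```python
import torch

# KL divergence between diagonal Gaussians is asymmetric in its arguments,
# unlike Euclidean/cosine distances; a simplified stand-in for ADM's measure.
def kl_diag_gauss(mu1, var1, mu2, var2):
    return 0.5 * (torch.log(var2 / var1) + (var1 + (mu1 - mu2) ** 2) / var2 - 1).sum()

support = torch.randn(5, 64)                     # 5 support embeddings (5-shot)
mu_c, var_c = support.mean(0), support.var(0) + 1e-6  # class statistics
mu_q, var_q = torch.randn(64), torch.full((64,), 1e-2)  # query as a sharp Gaussian
print(kl_diag_gauss(mu_q, var_q, mu_c, var_c))   # query -> class
print(kl_diag_gauss(mu_c, var_c, mu_q, var_q))   # class -> query (differs)
```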
- Asymmetric GANs for Image-to-Image Translation [62.49892218126542]
Existing models for Generative Adversarial Networks (GANs) learn the mapping from the source domain to the target domain using a cycle-consistency loss.
We propose an AsymmetricGAN model with translation and reconstruction generators of unequal sizes and a different parameter-sharing strategy.
Experiments on both supervised and unsupervised generative tasks with 8 datasets show that AsymmetricGAN achieves superior model capacity and better generation performance.
arXiv Detail & Related papers (2019-12-14T21:24:41Z)
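The architectural asymmetry is simple to express: give the translation generator more capacity than the reconstruction generator, instead of using two identical networks. A toy sketch, not the AsymmetricGAN architecture itself:

```python
import torch
import torch.nn as nn

# Toy illustration of unequal-capacity generators in a cycle setup.
def make_generator(width, depth):
    layers, dim = [], 64
    for _ in range(depth):
        layers += [nn.Linear(dim, width), nn.ReLU()]
        dim = width
    layers.append(nn.Linear(dim, 64))
    return nn.Sequential(*layers)

g_translate = make_generator(width=256, depth=4)   # heavier: source -> target
g_reconstruct = make_generator(width=64, depth=2)  # lighter: target -> source
x = torch.randn(8, 64)
recon = g_reconstruct(g_translate(x))              # cycle: x -> y -> x
print(sum(p.numel() for p in g_translate.parameters()),
      sum(p.numel() for p in g_reconstruct.parameters()))
```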