Your Contrastive Learning Is Secretly Doing Stochastic Neighbor Embedding
- URL: http://arxiv.org/abs/2205.14814v2
- Date: Fri, 2 Jun 2023 08:55:43 GMT
- Title: Your Contrastive Learning Is Secretly Doing Stochastic Neighbor Embedding
- Authors: Tianyang Hu, Zhili Liu, Fengwei Zhou, Wenjia Wang, Weiran Huang
- Abstract summary: Self-supervised contrastive learning (SSCL) has achieved great success in extracting powerful features from unlabeled data.
We contribute to the theoretical understanding of SSCL and uncover its connection to the classic data visualization method, neighbor embedding.
We provide novel analysis on domain-agnostic augmentations, implicit bias and robustness of learned features.
- Score: 12.421540007814937
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Contrastive learning, especially self-supervised contrastive learning (SSCL),
has achieved great success in extracting powerful features from unlabeled data.
In this work, we contribute to the theoretical understanding of SSCL and
uncover its connection to the classic data visualization method, stochastic
neighbor embedding (SNE), whose goal is to preserve pairwise distances. From
the perspective of preserving neighboring information, SSCL can be viewed as a
special case of SNE with the input space pairwise similarities specified by
data augmentation. The established correspondence facilitates deeper
theoretical understanding of learned features of SSCL, as well as
methodological guidelines for practical improvement. Specifically, through the
lens of SNE, we provide novel analysis on domain-agnostic augmentations,
implicit bias and robustness of learned features. To illustrate the practical
advantage, we demonstrate that the modifications from SNE to $t$-SNE can also
be adopted in the SSCL setting, achieving significant improvement in both
in-distribution and out-of-distribution generalization.
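To make the stated correspondence concrete, below is a minimal sketch in NumPy (not the authors' code; the function names, kernel choices, and toy setup are assumptions for illustration). It writes a contrastive objective in SNE form: the input-space similarity is concentrated on augmented positive pairs, the embedding-space similarity comes from a kernel on pairwise distances, and swapping the Gaussian kernel for a heavy-tailed Student-t kernel mirrors the SNE-to-$t$-SNE modification the abstract refers to.
```python
# Minimal illustrative sketch of the SSCL-as-SNE viewpoint described in the abstract.
# The input-space similarity p_ij is implicit: it is 1 for augmented views of the
# same image (the "positive" pair) and 0 otherwise. Names below are assumptions.
import numpy as np

def pairwise_q(z, kernel="gaussian", tau=0.5):
    """Embedding-space similarity matrix q_ij (row-normalized, diagonal excluded)."""
    d2 = np.sum((z[:, None, :] - z[None, :, :]) ** 2, axis=-1)  # squared distances
    if kernel == "gaussian":          # SNE / InfoNCE-style exponential kernel
        sim = np.exp(-d2 / tau)
    else:                             # heavy-tailed Student-t kernel, as in t-SNE
        sim = 1.0 / (1.0 + d2)
    np.fill_diagonal(sim, 0.0)        # a point is never its own neighbor
    return sim / sim.sum(axis=1, keepdims=True)

def sne_style_contrastive_loss(z, pos_index, kernel="gaussian"):
    """Cross-entropy between the augmentation-defined p (one-hot on the positive
    pair) and the embedding similarities q; with the Gaussian kernel this is the
    InfoNCE form."""
    q = pairwise_q(z, kernel=kernel)
    n = z.shape[0]
    return -np.mean(np.log(q[np.arange(n), pos_index] + 1e-12))

# Toy usage: 2N rows = N images with two augmented views each (i and i+N are positives).
rng = np.random.default_rng(0)
N, dim = 8, 16
z = rng.normal(size=(2 * N, dim))
z /= np.linalg.norm(z, axis=1, keepdims=True)          # unit-normalized embeddings
pos = np.concatenate([np.arange(N, 2 * N), np.arange(N)])
print(sne_style_contrastive_loss(z, pos, kernel="gaussian"))   # SNE / InfoNCE view
print(sne_style_contrastive_loss(z, pos, kernel="student-t"))  # t-SNE-style variant
```
With unit-normalized embeddings, the Gaussian-kernel version coincides with the usual InfoNCE loss up to a rescaling of the temperature, since exp(-||z_i - z_j||^2 / tau) is proportional to exp(2 z_i·z_j / tau) after row normalization.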
Related papers
- Understanding the Role of Equivariance in Self-supervised Learning [51.56331245499712]
Equivariant self-supervised learning (E-SSL) learns augmentation-aware features.
We identify a critical explaining-away effect in E-SSL that creates a synergy between the equivariant and classification tasks.
We reveal several principles for practical designs of E-SSL.
arXiv Detail & Related papers (2024-11-10T16:09:47Z)
- C-ICL: Contrastive In-context Learning for Information Extraction [54.39470114243744]
c-ICL is a novel few-shot technique that leverages both correct and incorrect sample constructions to create in-context learning demonstrations.
Our experiments on various datasets indicate that c-ICL outperforms previous few-shot in-context learning methods.
arXiv Detail & Related papers (2024-02-17T11:28:08Z)
- Relaxed Contrastive Learning for Federated Learning [48.96253206661268]
We propose a novel contrastive learning framework to address the challenges of data heterogeneity in federated learning.
Our framework outperforms all existing federated learning approaches by huge margins on the standard benchmarks.
arXiv Detail & Related papers (2024-01-10T04:55:24Z)
- Supervised Stochastic Neighbor Embedding Using Contrastive Learning [4.560284382063488]
Clusters of samples belonging to the same class are pulled together in low-dimensional embedding space.
We extend the self-supervised contrastive approach to the fully-supervised setting, allowing us to effectively leverage label information.
arXiv Detail & Related papers (2023-09-15T00:26:21Z)
- What Constitutes Good Contrastive Learning in Time-Series Forecasting? [10.44543726728613]
Self-supervised contrastive learning (SSCL) has demonstrated remarkable improvements in representation learning across various domains.
This paper aims to conduct a comprehensive analysis of the effectiveness of various SSCL algorithms, learning strategies, model architectures, and their interplay.
We demonstrate that the end-to-end training of a Transformer model using the Mean Squared Error (MSE) loss and SSCL emerges as the most effective approach in time series forecasting.
arXiv Detail & Related papers (2023-06-21T08:05:05Z)
- ArCL: Enhancing Contrastive Learning with Augmentation-Robust Representations [30.745749133759304]
We develop a theoretical framework to analyze the transferability of self-supervised contrastive learning.
We show that contrastive learning fails to learn domain-invariant features, which limits its transferability.
Based on these theoretical insights, we propose a novel method called Augmentation-robust Contrastive Learning (ArCL).
arXiv Detail & Related papers (2023-03-02T09:26:20Z)
- ProCC: Progressive Cross-primitive Compatibility for Open-World Compositional Zero-Shot Learning [29.591615811894265]
Open-World Compositional Zero-shot Learning (OW-CZSL) aims to recognize novel compositions of state and object primitives in images with no priors on the compositional space.
We propose a novel method, termed Progressive Cross-primitive Compatibility (ProCC), to mimic the human learning process for OW-CZSL tasks.
arXiv Detail & Related papers (2022-11-19T10:09:46Z)
- Knowledge Enhanced Neural Networks for relational domains [83.9217787335878]
We focus on a specific method, KENN, a Neural-Symbolic architecture that injects prior logical knowledge into a neural network.
In this paper, we propose an extension of KENN for relational data.
arXiv Detail & Related papers (2022-05-31T13:00:34Z)
- Learning Where to Learn in Cross-View Self-Supervised Learning [54.14989750044489]
Self-supervised learning (SSL) has made enormous progress and has largely narrowed the gap with supervised methods.
Current methods simply adopt uniform aggregation of pixels for embedding.
We present a new approach, Learning Where to Learn (LEWEL), to adaptively aggregate spatial information of features.
arXiv Detail & Related papers (2022-03-28T17:02:42Z)
- Self-learn to Explain Siamese Networks Robustly [22.913886901196353]
Learning to compare two objects is used in digital forensics, face recognition, and brain network analysis, especially when labeled data is scarce.
As these applications make high-stakes decisions involving societal values such as fairness and imbalance, it is critical to explain the learned models.
arXiv Detail & Related papers (2021-09-15T15:28:39Z)
- A Trainable Optimal Transport Embedding for Feature Aggregation and its Relationship to Attention [96.77554122595578]
We introduce a parametrized representation of fixed size, which embeds and then aggregates elements from a given input set according to the optimal transport plan between the set and a trainable reference.
Our approach scales to large datasets and allows end-to-end training of the reference, while also providing a simple unsupervised learning mechanism with small computational cost.
arXiv Detail & Related papers (2020-06-22T08:35:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.