Toward a Geometrical Understanding of Self-supervised Contrastive
Learning
- URL: http://arxiv.org/abs/2205.06926v1
- Date: Fri, 13 May 2022 23:24:48 GMT
- Title: Toward a Geometrical Understanding of Self-supervised Contrastive
Learning
- Authors: Romain Cosentino, Anirvan Sengupta, Salman Avestimehr, Mahdi
Soltanolkotabi, Antonio Ortega, Ted Willke, Mariano Tepper
- Abstract summary: Self-supervised learning (SSL) is one of the premier techniques to create data representations that are actionable for transfer learning in the absence of human annotations.
Mainstream SSL techniques rely on a specific deep neural network architecture with two cascaded neural networks: the encoder and the projector.
In this paper, we investigate how the strength of the data augmentation policies affects the data embedding.
- Score: 55.83778629498769
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Self-supervised learning (SSL) is currently one of the premier techniques to
create data representations that are actionable for transfer learning in the
absence of human annotations. Despite their success, the underlying geometry of
these representations remains elusive, which obfuscates the quest for more
robust, trustworthy, and interpretable models. In particular, mainstream SSL
techniques rely on a specific deep neural network architecture with two
cascaded neural networks: the encoder and the projector. When used for transfer
learning, the projector is discarded since empirical results show that its
representation generalizes more poorly than the encoder's. In this paper, we
investigate this curious phenomenon and analyze how the strength of the data
augmentation policies affects the data embedding. We discover a non-trivial
relation between the encoder, the projector, and the data augmentation
strength: with increasingly larger augmentation policies, the projector, rather
than the encoder, is more strongly driven to become invariant to the
augmentations. It does so by eliminating crucial information about the data by
learning to project it into a low-dimensional space, a noisy estimate of the
data manifold tangent plane in the encoder representation. This analysis is
substantiated through a geometrical perspective with theoretical and empirical
results.
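As a rough, self-contained illustration of the setup described in this abstract, the sketch below builds a SimCLR-style encoder/projector pair, keeps only the encoder output for transfer learning, and probes how invariant each of the two representations is to a batch of augmented views of one image. The ResNet-18 backbone, projector sizes, torchvision augmentations, and cosine-similarity probe are all illustrative assumptions, not the paper's actual experimental configuration.

```python
# Minimal sketch (PyTorch) of the encoder/projector pipeline discussed above.
# Backbone, dimensions, and augmentations are illustrative placeholders,
# not the configuration used in the paper.
import torch
import torch.nn as nn
import torchvision.transforms as T
from torchvision.models import resnet18


class SSLModel(nn.Module):
    def __init__(self, feat_dim=512, proj_dim=128):
        super().__init__()
        backbone = resnet18(weights=None)
        backbone.fc = nn.Identity()        # encoder f: image -> R^feat_dim
        self.encoder = backbone
        self.projector = nn.Sequential(    # projector g: R^feat_dim -> R^proj_dim
            nn.Linear(feat_dim, feat_dim), nn.ReLU(),
            nn.Linear(feat_dim, proj_dim),
        )

    def forward(self, x):
        h = self.encoder(x)                # kept for transfer learning
        z = self.projector(h)              # used only by the SSL loss, then discarded
        return h, z


# "Augmentation strength" is controlled by parameters such as the crop scale
# and the color-jitter magnitudes below.
augment = T.Compose([
    T.RandomResizedCrop(224, scale=(0.2, 1.0)),
    T.ColorJitter(0.8, 0.8, 0.8, 0.2),
    T.RandomGrayscale(p=0.2),
])


@torch.no_grad()
def augmentation_invariance(model, image, n_views=32):
    """Mean pairwise cosine similarity among embeddings of augmented views of
    one image; values closer to 1 indicate stronger invariance."""
    views = torch.stack([augment(image) for _ in range(n_views)])
    h, z = model(views)

    def mean_cos(e):
        e = nn.functional.normalize(e, dim=1)
        sim = e @ e.T
        off_diag = sim[~torch.eye(len(e), dtype=torch.bool)]
        return off_diag.mean().item()

    return {"encoder": mean_cos(h), "projector": mean_cos(z)}


model = SSLModel().eval()
image = torch.rand(3, 224, 224)            # stand-in for a real image
print(augmentation_invariance(model, image))
# After SSL training with strong augmentations, the analysis above predicts the
# projector score to exceed the encoder score.
```

A similar probe that runs PCA on the cloud of projected views of a single image would give a crude empirical estimate of the low-dimensional structure that the abstract attributes to the projector output.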
Related papers
- Deep Learning Through A Telescoping Lens: A Simple Model Provides Empirical Insights On Grokking, Gradient Boosting & Beyond [61.18736646013446]
In pursuit of a deeper understanding of deep learning's surprising behaviors, we investigate the utility of a simple yet accurate model of a trained neural network.
Across three case studies, we illustrate how it can be applied to derive new empirical insights on a diverse range of prominent phenomena.
arXiv Detail & Related papers (2024-10-31T22:54:34Z)
- Understanding Encoder-Decoder Structures in Machine Learning Using Information Measures [10.066310107046084]
We present new results to model and understand the role of encoder-decoder design in machine learning (ML).
We use two main information concepts, information sufficiency (IS) and mutual information loss (MIL), to represent predictive structures in machine learning.
arXiv Detail & Related papers (2024-05-30T19:58:01Z)
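Purely for orientation, the two information measures named in the entry above are often written in the following generic forms, which may differ from that paper's exact definitions; here $X$ is the input, $Y$ the target, and $U = f(X)$ the encoder output:
\[
\text{information sufficiency:}\;\; I(U;Y) = I(X;Y),
\qquad
\text{mutual information loss:}\;\; \mathrm{MIL}(f) = I(X;Y) - I(U;Y) \;\ge\; 0,
\]
where the inequality follows from the data-processing inequality, so a sufficient encoding is exactly one with zero mutual information loss.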
- Exploring Compressed Image Representation as a Perceptual Proxy: A Study [1.0878040851638]
We propose an end-to-end learned image compression scheme wherein the analysis transform is jointly trained with an object classification task.
This study affirms that the compressed latent representation can predict human perceptual distance judgments with an accuracy comparable to a custom-tailored DNN-based quality metric.
arXiv Detail & Related papers (2024-01-14T04:37:17Z)
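As a hedged sketch of what "jointly trained with an object classification task" typically amounts to for a learned codec (a generic form, not necessarily that paper's formulation), the training objective combines rate, distortion, and a classification term on the latent $y = g_a(x)$ with reconstruction $\hat{x}$ and classifier $c$:
\[
\mathcal{L} \;=\; \underbrace{R(y)}_{\text{rate}} \;+\; \lambda\,\underbrace{d\!\left(x,\hat{x}\right)}_{\text{distortion}} \;+\; \gamma\,\underbrace{\mathrm{CE}\!\left(c(y),\,\text{label}\right)}_{\text{classification}},
\]
with trade-off weights $\lambda$ and $\gamma$ chosen as hyperparameters.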
- Understanding Augmentation-based Self-Supervised Representation Learning via RKHS Approximation and Regression [53.15502562048627]
Recent work has built the connection between self-supervised learning and the approximation of the top eigenspace of a graph Laplacian operator.
This work delves into a statistical analysis of augmentation-based pretraining.
arXiv Detail & Related papers (2023-06-01T15:18:55Z)
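The graph Laplacian connection mentioned in the entry above usually refers to the augmentation graph over data points; a sketch of that standard construction, in generic notation that need not match that paper's, is:
\[
w_{x,x'} \;=\; \mathbb{E}_{\bar{x}\sim p_{\text{data}}}\!\left[a(x\mid\bar{x})\,a(x'\mid\bar{x})\right],
\qquad
L \;=\; I - D^{-1/2} W D^{-1/2},
\]
where $a(\cdot\mid\bar{x})$ is the augmentation distribution of a clean example $\bar{x}$, $W=[w_{x,x'}]$ collects the edge weights, and $D$ is the degree matrix; augmentation-based SSL representations then approximately span the eigenvectors of $L$ with the smallest eigenvalues, i.e., the top eigenspace of the normalized adjacency.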
- Augmentation-aware Self-supervised Learning with Conditioned Projector [6.720605329045581]
Self-supervised learning (SSL) is a powerful technique for learning from unlabeled data.
We propose to foster sensitivity to augmentation characteristics in the representation space by modifying the projector network.
Our approach, coined Conditional Augmentation-aware Self-supervised Learning (CASSLE), is directly applicable to typical joint-embedding SSL methods.
arXiv Detail & Related papers (2023-05-31T12:24:06Z)
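A minimal sketch of what conditioning the projector on augmentation information can look like: concatenating the augmentation parameters with the encoder representation, as below, is one simple realization; the names and dimensions are illustrative assumptions, not the CASSLE implementation.

```python
# Minimal sketch of an augmentation-conditioned projector (illustrative only).
import torch
import torch.nn as nn


class ConditionedProjector(nn.Module):
    def __init__(self, feat_dim=512, aug_dim=8, proj_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim + aug_dim, feat_dim),  # representation + augmentation parameters
            nn.ReLU(),
            nn.Linear(feat_dim, proj_dim),
        )

    def forward(self, h, aug_params):
        # h:          (batch, feat_dim) encoder output
        # aug_params: (batch, aug_dim), e.g. crop box, jitter strengths, flip flag
        return self.net(torch.cat([h, aug_params], dim=1))


projector = ConditionedProjector()
h, aug_params = torch.randn(4, 512), torch.rand(4, 8)
z = projector(h, aug_params)    # (4, 128), fed to the usual contrastive loss
```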
- Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
- A neural anisotropic view of underspecification in deep learning [60.119023683371736]
We show that the way neural networks handle the underspecification of problems is highly dependent on the data representation.
Our results highlight that understanding the architectural inductive bias in deep learning is fundamental to address the fairness, robustness, and generalization of these systems.
arXiv Detail & Related papers (2021-04-29T14:31:09Z)
- Malicious Network Traffic Detection via Deep Learning: An Information Theoretic View [0.0]
We study how homeomorphism affects the learned representation of a malware traffic dataset.
Our results suggest that although the details of learned representations and the specific coordinate system defined over the manifold of all parameters differ slightly, the functional approximations are the same.
arXiv Detail & Related papers (2020-09-16T15:37:44Z)
- IntroVAC: Introspective Variational Classifiers for Learning Interpretable Latent Subspaces [6.574517227976925]
IntroVAC learns interpretable latent subspaces by exploiting information from an additional label.
We show that IntroVAC is able to learn meaningful directions in the latent space enabling fine manipulation of image attributes.
arXiv Detail & Related papers (2020-08-03T10:21:41Z)
- Laplacian Denoising Autoencoder [114.21219514831343]
We propose to learn data representations with a novel type of denoising autoencoder.
The noisy input data is generated by corrupting latent clean data in the gradient domain.
Experiments on several visual benchmarks demonstrate that better representations can be learned with the proposed approach.
arXiv Detail & Related papers (2020-03-30T16:52:39Z)
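To make "corrupting latent clean data in the gradient domain" concrete, the heavily simplified sketch below adds noise to horizontal finite differences and rebuilds each row by cumulative summation; it only conveys the general idea of gradient-domain corruption and is not the reconstruction procedure used in that paper.

```python
# Simplified illustration of gradient-domain corruption (not the paper's method).
import numpy as np

rng = np.random.default_rng(0)
clean = rng.random((8, 64))                       # stand-in for latent clean data

grad = np.diff(clean, axis=1)                     # horizontal gradients, shape (8, 63)
noisy_grad = grad + 0.05 * rng.standard_normal(grad.shape)

# Reintegrate: keep the first column and cumulatively sum the corrupted gradients.
corrupted = np.concatenate(
    [clean[:, :1], clean[:, :1] + np.cumsum(noisy_grad, axis=1)], axis=1
)
assert corrupted.shape == clean.shape
# A denoising autoencoder is then trained to map `corrupted` back to `clean`.
```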