An Augmentation-Aware Theory for Self-Supervised Contrastive Learning
- URL: http://arxiv.org/abs/2505.22196v1
- Date: Wed, 28 May 2025 10:18:20 GMT
- Title: An Augmentation-Aware Theory for Self-Supervised Contrastive Learning
- Authors: Jingyi Cui, Hongwei Wen, Yisen Wang
- Abstract summary: We propose an augmentation-aware error bound for self-supervised contrastive learning, showing that the supervised risk is bounded not only by the unsupervised risk, but also explicitly by a trade-off induced by data augmentation.
- Score: 25.01234368914713
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Self-supervised contrastive learning has emerged as a powerful tool in machine learning and computer vision for learning meaningful representations from unlabeled data. Its empirical success has motivated many theoretical studies that seek to reveal the underlying learning mechanisms. However, existing theoretical research leaves the role of data augmentation underexplored, especially the effects of specific augmentation types. To fill this gap, we propose, for the first time, an augmentation-aware error bound for self-supervised contrastive learning, showing that the supervised risk is bounded not only by the unsupervised risk but also explicitly by a trade-off induced by data augmentation. Then, under a novel semantic label assumption, we discuss how specific augmentation methods affect the error bound. Lastly, we conduct both pixel- and representation-level experiments to verify the proposed theoretical results.
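For intuition, a schematic form of such a bound is sketched below in our own notation ($R_{\mathrm{sup}}$, $R_{\mathrm{un}}$, the constants $c_1, c_2$, and the augmentation term $\Delta_{\mathcal{A}}$ are illustrative assumptions, not the paper's exact statement):

```latex
% Schematic augmentation-aware bound (illustrative notation, not the exact theorem).
% The downstream supervised risk of an encoder f is controlled by its unsupervised
% contrastive risk plus a trade-off term that depends on the augmentation family A.
R_{\mathrm{sup}}(f) \;\le\; c_1\, R_{\mathrm{un}}(f) \;+\; c_2\, \Delta_{\mathcal{A}},
\qquad
\Delta_{\mathcal{A}} \;=\;
\underbrace{\mathbb{E}_{x}\, \mathbb{E}_{a,a' \sim \mathcal{A}} \big\| f(a(x)) - f(a'(x)) \big\|}_{\text{alignment across augmented views}}
\;+\;
\underbrace{\Pr_{x,\, a \sim \mathcal{A}} \big[\, y(a(x)) \neq y(x) \,\big]}_{\text{semantic-label distortion}}
```

Stronger augmentation typically shrinks the alignment term while growing the distortion term, which is the trade-off such a bound makes explicit.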
Related papers
- How does Labeling Error Impact Contrastive Learning? A Perspective from Data Dimensionality Reduction [29.43826752911795]
This paper investigates the theoretical impact of labeling error on the downstream classification performance of contrastive learning. To mitigate these impacts, a data dimensionality reduction method (e.g., singular value decomposition) is applied to the original data to reduce false-positive samples. It is also found that SVD acts as a double-edged sword: it may degrade downstream classification accuracy due to the reduced connectivity of the augmentation graph.
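A rough sketch of that dimensionality-reduction step (a generic truncated-SVD projection with an illustrative rank; the paper's actual procedure and rank selection may differ):

```python
import numpy as np

def svd_denoise(X: np.ndarray, rank: int) -> np.ndarray:
    """Project the data matrix onto its top-`rank` singular directions.

    Keeping only dominant components can suppress the noise directions that
    create false-positive pairs, but too small a rank also reduces the
    connectivity of the augmentation graph (the double-edged sword above).
    """
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    return (U[:, :rank] * S[:rank]) @ Vt[:rank, :]

# Toy usage: 1000 samples, 128 features, keep the top 16 components.
X = np.random.randn(1000, 128)
X_reduced = svd_denoise(X, rank=16)
```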
arXiv Detail & Related papers (2025-07-15T10:09:55Z) - The Clever Hans Effect in Unsupervised Learning [24.107672144631326]
We show for the first time that Clever Hans effects are widespread in unsupervised learning.
Our work sheds light on unexplored risks associated with practical applications of unsupervised learning.
arXiv Detail & Related papers (2024-08-15T09:19:42Z) - Self-Distilled Disentangled Learning for Counterfactual Prediction [49.84163147971955]
We propose the Self-Distilled Disentanglement framework, known as $SD^2$.
Grounded in information theory, it ensures theoretically sound independent disentangled representations without intricate mutual information estimator designs.
Our experiments, conducted on both synthetic and real-world datasets, confirm the effectiveness of our approach.
arXiv Detail & Related papers (2024-06-09T16:58:19Z) - Unveiling the Potential of Probabilistic Embeddings in Self-Supervised Learning [4.124934010794795]
Self-supervised learning has played a pivotal role in advancing machine learning by allowing models to acquire meaningful representations from unlabeled data.
We investigate the impact of probabilistic modeling on the information bottleneck, shedding light on a trade-off between compression and preservation of information in both representation and loss space.
Our findings suggest that introducing an additional bottleneck in the loss space can significantly enhance the ability to detect out-of-distribution examples.
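A minimal sketch of a probabilistic embedding with such a bottleneck (a toy Gaussian head with a KL regularizer of our own devising, not the paper's architecture):

```python
import torch
import torch.nn as nn

class GaussianHead(nn.Module):
    """Map deterministic features to a diagonal Gaussian over embeddings."""

    def __init__(self, dim_in: int, dim_out: int):
        super().__init__()
        self.mu = nn.Linear(dim_in, dim_out)
        self.log_var = nn.Linear(dim_in, dim_out)

    def forward(self, h: torch.Tensor):
        mu, log_var = self.mu(h), self.log_var(h)
        # Reparameterized sample from N(mu, diag(exp(log_var))).
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * log_var)
        # KL to a standard normal prior: the bottleneck term that trades
        # compression of the representation against preserved information.
        kl = 0.5 * (mu.pow(2) + log_var.exp() - 1.0 - log_var).sum(-1).mean()
        return z, kl

head = GaussianHead(512, 128)
z, kl = head(torch.randn(32, 512))
# Training would minimize: ssl_loss(z) + beta * kl, with beta setting the trade-off.
```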
arXiv Detail & Related papers (2023-10-27T12:01:16Z) - Interpretable Imitation Learning with Dynamic Causal Relations [65.18456572421702]
We propose to expose captured knowledge in the form of a directed acyclic causal graph.
We also design this causal discovery process to be state-dependent, enabling it to model the dynamics in latent causal graphs.
The proposed framework is composed of three parts: a dynamic causal discovery module, a causality encoding module, and a prediction module, and is trained in an end-to-end manner.
arXiv Detail & Related papers (2023-09-30T20:59:42Z) - Learning Causal Mechanisms through Orthogonal Neural Networks [2.77390041716769]
We investigate the problem of learning, in a fully unsupervised manner, the inverses of a set of independent mechanisms from distorted data points.
We propose an unsupervised method that discovers and disentangles a set of independent mechanisms from unlabeled data, and learns how to invert them.
arXiv Detail & Related papers (2023-06-05T13:11:33Z) - Understanding Self-Predictive Learning for Reinforcement Learning [61.62067048348786]
We study the learning dynamics of self-predictive learning for reinforcement learning.
We propose a novel self-predictive algorithm that learns two representations simultaneously.
arXiv Detail & Related papers (2022-12-06T20:43:37Z) - Exploring Inconsistent Knowledge Distillation for Object Detection with Data Augmentation [66.25738680429463]
Knowledge Distillation (KD) for object detection aims to train a compact detector by transferring knowledge from a teacher model.
We propose inconsistent knowledge distillation (IKD) which aims to distill knowledge inherent in the teacher model's counter-intuitive perceptions.
Our method outperforms state-of-the-art KD baselines on one-stage, two-stage and anchor-free object detectors.
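For orientation, the standard soft-label distillation objective such methods build on is sketched below (Hinton-style KD; IKD's selection of inconsistent teacher responses under augmentation is not reproduced here):

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits: torch.Tensor, teacher_logits: torch.Tensor,
            T: float = 4.0) -> torch.Tensor:
    """Temperature-softened KL between teacher and student predictions.

    IKD departs from this baseline by distilling on samples where strong
    data augmentation makes the teacher's perception counter-intuitive;
    that sample-selection logic is omitted in this generic sketch.
    """
    p_teacher = F.softmax(teacher_logits / T, dim=-1)
    log_p_student = F.log_softmax(student_logits / T, dim=-1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (T * T)
```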
arXiv Detail & Related papers (2022-09-20T16:36:28Z) - Generalizable Information Theoretic Causal Representation [37.54158138447033]
We propose to learn causal representation from observational data by regularizing the learning procedure with mutual information measures according to our hypothetical causal graph.
The optimization involves a counterfactual loss, from which we deduce a theoretical guarantee that causality-inspired learning enjoys reduced sample complexity and better generalization ability.
arXiv Detail & Related papers (2022-02-17T00:38:35Z) - The Power of Contrast for Feature Learning: A Theoretical Analysis [42.20116348668721]
We show that contrastive learning outperforms standard autoencoders and generative adversarial networks.
We also illustrate the impact of labeled data in supervised contrastive learning.
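For reference, the contrastive objective such analyses typically study is InfoNCE; a standard NT-Xent-style sketch over a batch of augmented pairs follows (our generic form, not the paper's exact setup):

```python
import torch
import torch.nn.functional as F

def info_nce(z1: torch.Tensor, z2: torch.Tensor, tau: float = 0.5) -> torch.Tensor:
    """InfoNCE over positive pairs (z1[i], z2[i]), two views per sample;
    every other embedding in the batch serves as a negative."""
    n = z1.size(0)
    z = F.normalize(torch.cat([z1, z2]), dim=-1)    # (2n, d), unit norm
    sim = z @ z.t() / tau                           # scaled cosine similarities
    sim.fill_diagonal_(float("-inf"))               # exclude self-pairs
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(n)])
    return F.cross_entropy(sim, targets)
```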
arXiv Detail & Related papers (2021-10-06T03:10:28Z) - Adversarial Examples for Unsupervised Machine Learning Models [71.81480647638529]
Adversarial examples causing evasive predictions are widely used to evaluate and improve the robustness of machine learning models.
We propose a framework of generating adversarial examples for unsupervised models and demonstrate novel applications to data augmentation.
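One simple way to instantiate this idea is an FGSM-style attack on a model's own unsupervised objective (here a reconstruction loss; the paper's framework is more general):

```python
import torch
import torch.nn.functional as F

def fgsm_unsupervised(model: torch.nn.Module, x: torch.Tensor,
                      eps: float = 0.03) -> torch.Tensor:
    """Perturb x to maximize the model's reconstruction error.

    No labels are needed: the unsupervised loss itself supplies the
    attack gradient, and the resulting examples can double as hard
    samples for data augmentation.
    """
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.mse_loss(model(x_adv), x)
    loss.backward()
    return (x_adv + eps * x_adv.grad.sign()).detach()
```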
arXiv Detail & Related papers (2021-03-02T17:47:58Z) - A Sober Look at the Unsupervised Learning of Disentangled Representations and their Evaluation [63.042651834453544]
We show that the unsupervised learning of disentangled representations is impossible without inductive biases on both the models and the data.
We observe that while the different methods successfully enforce properties "encouraged" by the corresponding losses, well-disentangled models seemingly cannot be identified without supervision.
Our results suggest that future work on disentanglement learning should be explicit about the role of inductive biases and (implicit) supervision.
arXiv Detail & Related papers (2020-10-27T10:17:15Z) - Vulnerability Under Adversarial Machine Learning: Bias or Variance? [77.30759061082085]
We investigate the effect of adversarial machine learning on the bias and variance of a trained deep neural network.
Our analysis sheds light on why the deep neural networks have poor performance under adversarial perturbation.
We introduce a new adversarial machine learning algorithm with lower computational complexity than well-known adversarial machine learning strategies.
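For reference, the squared-loss bias-variance decomposition such analyses build on can be estimated empirically (a generic Monte Carlo sketch, not the paper's exact protocol); evaluating it on adversarially perturbed inputs lets one compare the two terms:

```python
import numpy as np

def bias_variance(preds: np.ndarray, y_true: np.ndarray):
    """Estimate squared bias and variance from an ensemble of models.

    preds: shape (n_models, n_points), predictions of models trained on
           independent draws of the training set (evaluated on clean or
           adversarially perturbed test points).
    """
    mean_pred = preds.mean(axis=0)
    bias_sq = ((mean_pred - y_true) ** 2).mean()
    variance = preds.var(axis=0).mean()
    return bias_sq, variance

# Toy usage with 10 models and 100 test points.
preds = np.random.randn(10, 100)
bias_sq, var = bias_variance(preds, y_true=np.zeros(100))
```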
arXiv Detail & Related papers (2020-08-01T00:58:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.