Generalization Bounds for Robust Contrastive Learning: From Theory to Practice
- URL: http://arxiv.org/abs/2311.09671v2
- Date: Mon, 27 Oct 2025 10:48:43 GMT
- Title: Generalization Bounds for Robust Contrastive Learning: From Theory to Practice
- Authors: Ngoc N. Tran, Lam Tran, Hoang Phan, Anh Bui, Tung Pham, Toan Tran, Dinh Phung, Trung Le
- Abstract summary: We develop theories to identify which components in the unsupervised training can help improve the robust supervised loss. Besides the adversarial contrastive loss, we reveal that the benign one, along with a global divergence between benign and adversarial examples, can also improve robustness.
- Score: 20.805320268190936
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Contrastive Learning first extracts features from unlabeled data, followed by linear probing with labeled data. Adversarial Contrastive Learning (ACL) integrates Adversarial Training into the first phase to enhance feature robustness against attacks in the probing phase. While ACL has shown strong empirical results, its theoretical understanding remains limited. Furthermore, while a fair number of theoretical works analyze how the unsupervised loss can support the supervised loss in the probing phase, none has examined its role in the robust supervised loss. To fill this gap, our work develops rigorous theories to identify which components in the unsupervised training can help improve the robust supervised loss. Specifically, besides the adversarial contrastive loss, we reveal that the benign one, along with a global divergence between benign and adversarial examples, can also improve robustness. Experiments are conducted to justify our findings.
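To make the reported finding concrete, here is a minimal PyTorch-style sketch of an unsupervised objective that combines a benign contrastive loss, an adversarial contrastive loss, and a global divergence between benign and adversarial features. The function names, the mean-matching divergence proxy, and the weights lambda_adv and lambda_div are illustrative assumptions, not the paper's exact formulation.

```python
# Hypothetical sketch of the objective the abstract describes; all names and
# weights are illustrative, not the paper's formulation.
import torch
import torch.nn.functional as F

def info_nce(z1, z2, temperature=0.5):
    """Standard InfoNCE between two batches of features (rows are paired views)."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature                  # (B, B) similarity matrix
    labels = torch.arange(z1.size(0), device=z1.device) # positives on the diagonal
    return F.cross_entropy(logits, labels)

def global_divergence(z_benign, z_adv):
    """Crude global divergence proxy: distance between batch feature means.
    (A real implementation might use a kernel MMD or a Wasserstein term.)"""
    return (z_benign.mean(0) - z_adv.mean(0)).pow(2).sum()

def robust_contrastive_loss(encoder, x1, x2, x1_adv, x2_adv,
                            lambda_adv=1.0, lambda_div=0.1):
    z1, z2 = encoder(x1), encoder(x2)             # benign views
    z1a, z2a = encoder(x1_adv), encoder(x2_adv)   # adversarial views
    benign = info_nce(z1, z2)                     # benign contrastive loss
    adversarial = info_nce(z1a, z2a)              # adversarial contrastive loss
    divergence = global_divergence(torch.cat([z1, z2]), torch.cat([z1a, z2a]))
    return benign + lambda_adv * adversarial + lambda_div * divergence
```

The benign term and the global divergence term are the two extra components the abstract identifies as also improving robustness, beyond the adversarial contrastive loss itself.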
Related papers
- Test-Time Learning of Causal Structure from Interventional Data [50.06913286558919]
We propose TICL (Test-time Interventional Causal Learning), a novel method that synergizes Test-Time Training with Joint Causal Inference. Specifically, we design a self-augmentation strategy to generate instance-specific training data at test time, effectively avoiding distribution shifts. By integrating joint causal inference, we develop a PC-inspired two-phase supervised learning scheme, which effectively leverages self-augmented training data while ensuring theoretical identifiability.
arXiv Detail & Related papers (2026-02-22T11:23:05Z) - A Unified and Stable Risk Minimization Framework for Weakly Supervised Learning with Theoretical Guarantees [33.15955234458642]
Weakly supervised learning has emerged as a practical alternative to fully supervised learning when complete and accurate labels are costly or infeasible to acquire. We propose a principled, unified framework that bypasses such post-hoc adjustments by formulating a stable surrogate risk grounded in the structure of weakly supervised data.
arXiv Detail & Related papers (2025-11-28T00:57:04Z) - Dual-Head Knowledge Distillation: Enhancing Logits Utilization with an Auxiliary Head [38.898038672237746]
We introduce a logit-level loss function as a supplement to the widely used probability-level loss function.
We find that combining the newly introduced logit-level loss with the previous probability-level loss leads to performance degradation.
We propose a novel method called dual-head knowledge distillation, which partitions the linear classifier into two classification heads responsible for different losses.
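As a rough illustration of the dual-head design described above, the sketch below attaches two linear heads to one backbone: one trained with a probability-level loss (cross-entropy plus softened KL distillation) and one with a logit-level loss (MSE here, as a placeholder). The head wiring and specific loss choices are assumptions for illustration, not the paper's exact recipe.

```python
# Illustrative dual-head student; loss choices are placeholders, not the
# paper's exact recipe.
import torch.nn as nn
import torch.nn.functional as F

class DualHeadStudent(nn.Module):
    def __init__(self, backbone, feat_dim, num_classes):
        super().__init__()
        self.backbone = backbone
        self.prob_head = nn.Linear(feat_dim, num_classes)   # probability-level loss
        self.logit_head = nn.Linear(feat_dim, num_classes)  # logit-level loss

    def forward(self, x):
        feats = self.backbone(x)
        return self.prob_head(feats), self.logit_head(feats)

def dual_head_kd_loss(student, x, y, teacher_logits, T=4.0):
    prob_logits, raw_logits = student(x)
    ce = F.cross_entropy(prob_logits, y)                    # supervised task loss
    kd = F.kl_div(F.log_softmax(prob_logits / T, dim=1),
                  F.softmax(teacher_logits / T, dim=1),
                  reduction="batchmean") * T * T            # probability-level KD
    logit_match = F.mse_loss(raw_logits, teacher_logits)    # logit-level loss
    return ce + kd + logit_match
```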
arXiv Detail & Related papers (2024-11-13T12:33:04Z) - SINDER: Repairing the Singular Defects of DINOv2 [61.98878352956125]
Vision Transformer models trained on large-scale datasets often exhibit artifacts in the patch tokens they extract.
We propose a novel fine-tuning smooth regularization that rectifies structural deficiencies using only a small dataset.
arXiv Detail & Related papers (2024-07-23T20:34:23Z) - What Makes CLIP More Robust to Long-Tailed Pre-Training Data? A Controlled Study for Transferable Insights [67.72413262980272]
Severe data imbalance naturally exists among web-scale vision-language datasets.
We find that CLIP pre-trained on such data exhibits notable robustness to data imbalance compared to supervised learning.
The robustness and discriminability of CLIP improve with more descriptive language supervision, larger data scale, and broader open-world concepts.
arXiv Detail & Related papers (2024-05-31T17:57:24Z) - Relaxed Contrastive Learning for Federated Learning [48.96253206661268]
We propose a novel contrastive learning framework to address the challenges of data heterogeneity in federated learning.
Our framework outperforms existing federated learning approaches by large margins on standard benchmarks.
arXiv Detail & Related papers (2024-01-10T04:55:24Z) - Semi-Supervised End-To-End Contrastive Learning For Time Series Classification [10.635321868623883]
Time series classification is a critical task in various domains, such as finance, healthcare, and sensor data analysis.
We propose an end-to-end model called SLOTS (Semi-supervised Learning fOr Time clasSification).
arXiv Detail & Related papers (2023-10-13T04:22:21Z) - Rethinking Weak Supervision in Helping Contrastive Learning [19.5649824524209]
We explore the mechanical differences between semi-supervised and noisy-labeled information in helping contrastive learning.
Specifically, we investigate the most intuitive paradigm of jointly training supervised and unsupervised contrastive losses.
We prove that semi-supervised labels improve the downstream error bound whereas noisy labels have limited effects under such a paradigm.
arXiv Detail & Related papers (2023-06-07T05:18:27Z) - Robustness of Unsupervised Representation Learning without Labels [92.90480374344777]
We propose a family of unsupervised robustness measures, which are model- and task-agnostic and label-free.
We validate our results against a linear probe and show that, for MOCOv2, adversarial training results in 3 times higher certified accuracy.
arXiv Detail & Related papers (2022-10-08T18:03:28Z) - Semi-Supervised Learning with Mutual Distillation for Monocular Depth Estimation [27.782150368174413]
We build two separate network branches, one for each loss, and distill them into each other through a mutual distillation loss function.
We conduct experiments to demonstrate the effectiveness of our framework over the latest methods and provide extensive ablation studies.
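A minimal sketch of the mutual-distillation idea, assuming L1 losses and detached (stop-gradient) targets; the branch-specific supervision in the paper differs, so treat this purely as an illustration of the loss structure.

```python
# Two branches, each with its own primary loss, plus a mutual term that pulls
# their depth predictions toward each other; losses here are illustrative.
import torch.nn.functional as F

def mutual_distillation_loss(branch_a, branch_b, image, target_a, target_b,
                             lambda_mutual=0.5):
    depth_a = branch_a(image)
    depth_b = branch_b(image)
    loss_a = F.l1_loss(depth_a, target_a)     # branch-specific loss A
    loss_b = F.l1_loss(depth_b, target_b)     # branch-specific loss B
    # Each branch distills the other's (detached) prediction.
    mutual = (F.l1_loss(depth_a, depth_b.detach())
              + F.l1_loss(depth_b, depth_a.detach()))
    return loss_a + loss_b + lambda_mutual * mutual
```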
arXiv Detail & Related papers (2022-03-18T04:28:58Z) - Adversarial Dual-Student with Differentiable Spatial Warping for Semi-Supervised Semantic Segmentation [70.2166826794421]
We propose a differentiable geometric warping to conduct unsupervised data augmentation.
We also propose a novel adversarial dual-student framework to improve the Mean-Teacher.
Our solution significantly improves the performance and state-of-the-art results are achieved on both datasets.
arXiv Detail & Related papers (2022-03-05T17:36:17Z) - The Power of Contrast for Feature Learning: A Theoretical Analysis [42.20116348668721]
We show that contrastive learning outperforms standard autoencoders and generative adversarial networks.
We also illustrate the impact of labeled data in supervised contrastive learning.
arXiv Detail & Related papers (2021-10-06T03:10:28Z) - Leveraged Weighted Loss for Partial Label Learning [64.85763991485652]
Partial label learning deals with data where each instance is assigned a set of candidate labels, of which only one is true.
Despite many methodology studies on learning from partial labels, theoretical understanding of their risk-consistency properties is still lacking.
We propose a family of loss functions named leveraged weighted (LW) loss, which for the first time introduces the leverage parameter $\beta$ to consider the trade-off between losses on partial labels and non-partial ones.
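As a toy illustration of the trade-off the leverage parameter $\beta$ controls, the sketch below weights losses on non-candidate labels by beta against losses on candidate labels. The softplus binary loss and the simple averaging are assumptions; the paper defines a whole family of such losses and analyzes their risk consistency, which this toy version does not capture.

```python
# Hedged sketch of a leveraged weighted (LW) style loss for partial labels.
import torch.nn.functional as F

def lw_loss(logits, candidate_mask, beta=1.0):
    """logits: (B, C); candidate_mask: (B, C) boolean, True for candidate labels."""
    psi = F.softplus                    # psi(z) = log(1 + exp(z)), one valid binary loss
    pos = psi(-logits)[candidate_mask]  # encourage high scores on candidate labels
    neg = psi(logits)[~candidate_mask]  # penalize high scores on non-candidates
    return pos.mean() + beta * neg.mean()
```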
arXiv Detail & Related papers (2021-06-10T13:25:13Z) - Learning and Certification under Instance-targeted Poisoning [49.55596073963654]
We study PAC learnability and certification under instance-targeted poisoning attacks.
We show that when the budget of the adversary scales sublinearly with the sample complexity, PAC learnability and certification are achievable.
We empirically study the robustness of K nearest neighbour, logistic regression, multi-layer perceptron, and convolutional neural network on real data sets.
arXiv Detail & Related papers (2021-05-18T17:48:15Z) - Curse or Redemption? How Data Heterogeneity Affects the Robustness of Federated Learning [51.15273664903583]
Data heterogeneity has been identified as one of the key features in federated learning, but it is often overlooked through the lens of robustness to adversarial attacks.
This paper focuses on characterizing and understanding its impact on backdoor attacks in federated learning through comprehensive experiments using synthetic datasets and the LEAF benchmarks.
arXiv Detail & Related papers (2021-02-01T06:06:21Z) - Robust Pre-Training by Adversarial Contrastive Learning [120.33706897927391]
Recent work has shown that, when integrated with adversarial training, self-supervised pre-training can lead to state-of-the-art robustness.
We improve robustness-aware self-supervised pre-training by learning representations consistent under both data augmentations and adversarial perturbations.
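One common way to instantiate "consistent under adversarial perturbations" is to craft the perturbation against the contrastive loss itself. The PGD-style loop below is a hedged sketch of that idea: it perturbs one augmented view to maximize an InfoNCE loss against the other. The info_nce argument is assumed to be a standard implementation (such as the one sketched earlier), and eps, alpha, and steps are illustrative values.

```python
# Sketch: craft an adversarial view by ascending a contrastive loss via PGD.
import torch

def contrastive_pgd(encoder, x1, x2, info_nce, eps=8/255, alpha=2/255, steps=5):
    x_adv = x1.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = info_nce(encoder(x_adv), encoder(x2))
        grad, = torch.autograd.grad(loss, x_adv)
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()            # ascend the loss
            x_adv = x1 + (x_adv - x1).clamp(-eps, eps)     # project to eps-ball
            x_adv = x_adv.clamp(0, 1)                      # keep a valid image
    return x_adv.detach()
```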
arXiv Detail & Related papers (2020-10-26T04:44:43Z) - A Commentary on the Unsupervised Learning of Disentangled Representations [63.042651834453544]
The goal of the unsupervised learning of disentangled representations is to separate the independent explanatory factors of variation in the data without access to supervision.
We discuss the theoretical result showing that the unsupervised learning of disentangled representations is fundamentally impossible without inductive biases.
arXiv Detail & Related papers (2020-07-28T13:13:45Z) - Structured Consistency Loss for semi-supervised semantic segmentation [1.4146420810689415]
The consistency loss has played a key role in solving problems in recent studies on semi-supervised learning.
We propose a structured consistency loss to address a limitation of extant studies.
We are the first to demonstrate the superiority of state-of-the-art semi-supervised learning for semantic segmentation.
arXiv Detail & Related papers (2020-01-14T07:08:45Z)