An Inclusive Theoretical Framework of Robust Supervised Contrastive Loss against Label Noise
- URL: http://arxiv.org/abs/2501.01130v1
- Date: Thu, 02 Jan 2025 08:12:23 GMT
- Title: An Inclusive Theoretical Framework of Robust Supervised Contrastive Loss against Label Noise
- Authors: Jingyi Cui, Yi-Ge Zhang, Hengyu Liu, Yisen Wang
- Abstract summary: We propose a unified theoretical framework for robust losses under the pairwise contrastive paradigm.
We derive a general robust condition for arbitrary contrastive losses, which serves as a criterion to verify the theoretical robustness of a supervised contrastive loss against label noise.
We show that our theory is an inclusive framework that provides explanations to prior robust techniques.
- Score: 20.696895070598643
- License:
- Abstract: Learning from noisy labels is a critical challenge in machine learning, with vast implications for numerous real-world scenarios. While supervised contrastive learning has recently emerged as a powerful tool for navigating label noise, many existing solutions remain heuristic, often devoid of a systematic theoretical foundation for crafting robust supervised contrastive losses. To address the gap, in this paper, we propose a unified theoretical framework for robust losses under the pairwise contrastive paradigm. In particular, we for the first time derive a general robust condition for arbitrary contrastive losses, which serves as a criterion to verify the theoretical robustness of a supervised contrastive loss against label noise. The theory indicates that the popular InfoNCE loss is in fact non-robust, and accordingly inspires us to develop a robust version of InfoNCE, termed Symmetric InfoNCE (SymNCE). Moreover, we highlight that our theory is an inclusive framework that provides explanations to prior robust techniques such as nearest-neighbor (NN) sample selection and robust contrastive loss. Validation experiments on benchmark datasets demonstrate the superiority of SymNCE against label noise.
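The central object of the abstract is the supervised contrastive (InfoNCE) loss, which the paper's theory shows to be non-robust to label noise. As a point of reference, here is a minimal NumPy sketch of supervised InfoNCE under the pairwise paradigm (function name and details are illustrative, not the paper's notation; the exact SymNCE symmetrization is defined in the paper and not reproduced here):

```python
import numpy as np

def supervised_info_nce(z, labels, temperature=0.1):
    """Supervised InfoNCE: samples sharing a label are positives,
    all other samples are negatives. The paper shows this loss is
    NOT robust to label noise."""
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # unit-norm embeddings
    sim = z @ z.T / temperature                        # pairwise similarities
    n = len(labels)
    np.fill_diagonal(sim, -np.inf)                     # exclude self-pairs
    # row-wise log-softmax (the -inf diagonal contributes zero mass)
    m = sim.max(axis=1, keepdims=True)
    log_prob = sim - (m + np.log(np.exp(sim - m).sum(axis=1, keepdims=True)))
    pos = labels[:, None] == labels[None, :]
    np.fill_diagonal(pos, False)
    # average negative log-probability over each anchor's positives
    per_anchor = [-log_prob[i, pos[i]].mean() for i in range(n) if pos[i].any()]
    return float(np.mean(per_anchor))
```

With noisy labels, positive pairs can point at dissimilar samples, which is the failure mode the robust condition is designed to diagnose.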
Related papers
- Balancing the Scales: A Theoretical and Algorithmic Framework for Learning from Imbalanced Data [35.03888101803088]
This paper introduces a novel theoretical framework for analyzing generalization in imbalanced classification.
We propose a new class-imbalanced margin loss function for both binary and multi-class settings, prove its strong $H$-consistency, and derive corresponding learning guarantees.
We devise novel and general learning algorithms, IMMAX, which incorporate confidence margins and are applicable to various hypothesis sets.
arXiv Detail & Related papers (2025-02-14T18:57:16Z) - Doubly Robust Instance-Reweighted Adversarial Training [107.40683655362285]
We propose a novel doubly-robust instance reweighted adversarial framework.
Our importance weights are obtained by optimizing the KL-divergence regularized loss function.
Our proposed approach outperforms related state-of-the-art baseline methods in terms of average robust performance.
arXiv Detail & Related papers (2023-08-01T06:16:18Z) - Expressive Losses for Verified Robustness via Convex Combinations [67.54357965665676]
We study the relationship between the over-approximation coefficient and performance profiles across different expressive losses.
We show that, while expressivity is essential, better approximations of the worst-case loss are not necessarily linked to superior robustness-accuracy trade-offs.
arXiv Detail & Related papers (2023-05-23T12:20:29Z) - Mitigating Memorization of Noisy Labels by Clipping the Model Prediction [43.11056374542014]
Cross Entropy (CE) loss has been shown not to be robust to noisy labels due to its unboundedness.
We propose LogitClip, which clamps the norm of the logit vector to ensure that it is upper bounded by a constant.
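The clamping operation described above admits a short sketch: rescale any logit vector whose L2 norm exceeds a threshold (a minimal illustration; the parameter name `tau` and function name are our own):

```python
import numpy as np

def logit_clip(logits, tau=1.0):
    """Clamp each logit vector's L2 norm to at most tau, preserving
    its direction. Upper-bounding the logits is the mechanism LogitClip
    uses to mitigate memorization of noisy labels."""
    norms = np.linalg.norm(logits, axis=-1, keepdims=True)
    scale = np.minimum(1.0, tau / np.maximum(norms, 1e-12))
    return logits * scale
```

Vectors already inside the ball of radius `tau` pass through unchanged; larger ones are radially projected onto its surface.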
arXiv Detail & Related papers (2022-12-08T03:35:42Z) - MaxMatch: Semi-Supervised Learning with Worst-Case Consistency [149.03760479533855]
We propose a worst-case consistency regularization technique for semi-supervised learning (SSL).
We present a generalization bound for SSL consisting of the empirical loss terms observed on labeled and unlabeled training data separately.
Motivated by this bound, we derive an SSL objective that minimizes the largest inconsistency between an original unlabeled sample and its multiple augmented variants.
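The worst-case objective described above can be sketched as the largest divergence between the model's prediction on an unlabeled sample and its predictions on the augmented variants (a hedged illustration using KL divergence; the paper's exact divergence and training setup may differ):

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for discrete distributions, clipped for numerical stability."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return float(np.sum(p * np.log(p / q)))

def worst_case_inconsistency(p_orig, p_augmented):
    """Largest divergence between the prediction on the original sample
    and the predictions on its augmented variants; minimizing this targets
    the worst-case consistency objective."""
    return max(kl_divergence(p_orig, p_a) for p_a in p_augmented)
```

Minimizing the maximum rather than the average forces every augmented view to stay consistent with the original prediction.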
arXiv Detail & Related papers (2022-09-26T12:04:49Z) - Robust Contrastive Learning against Noisy Views [79.71880076439297]
We propose a new contrastive loss function that is robust against noisy views.
We show that our approach provides consistent improvements over the state-of-the-art image, video, and graph contrastive learning benchmarks.
arXiv Detail & Related papers (2022-01-12T05:24:29Z) - On the Role of Entropy-based Loss for Learning Causal Structures with Continuous Optimization [27.613220411996025]
A method with non-combinatorial directed acyclic constraint, called NOTEARS, formulates the causal structure learning problem as a continuous optimization problem using least-square loss.
We show that the violation of the Gaussian noise assumption will hinder the causal direction identification.
We propose a more general entropy-based loss that is theoretically consistent with the likelihood score under any noise distribution.
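The NOTEARS formulation mentioned above scores a weighted adjacency matrix W with a least-squares term and enforces acyclicity through h(W) = tr(e^{W∘W}) − d, which is zero exactly when W encodes a DAG. A minimal sketch (a truncated power series stands in for the matrix exponential, adequate for small matrices; the entropy-based loss proposed in the paper replaces the least-squares term and is not reproduced here):

```python
import numpy as np

def acyclicity_h(W, terms=30):
    """NOTEARS acyclicity measure h(W) = tr(exp(W * W)) - d, where the
    square is elementwise. h(W) == 0 iff the weighted graph W is acyclic."""
    A = W * W
    d = A.shape[0]
    E = np.eye(d)
    term = np.eye(d)
    for k in range(1, terms):
        term = term @ A / k          # accumulates A^k / k!
        E = E + term
    return float(np.trace(E) - d)

def least_squares_score(X, W):
    """Least-squares fit of the linear SEM X ≈ X W, as used by NOTEARS."""
    n = X.shape[0]
    return float(0.5 / n * np.linalg.norm(X - X @ W, 'fro') ** 2)
```

The paper's point is that this least-squares score is only likelihood-consistent under Gaussian noise, motivating the more general entropy-based replacement.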
arXiv Detail & Related papers (2021-06-05T08:29:51Z) - Lower-bounded proper losses for weakly supervised classification [73.974163801142]
We discuss the problem of weakly supervised classification, in which instances are given weak labels.
We derive a representation theorem for proper losses in supervised learning, which dualizes the Savage representation.
We experimentally demonstrate the effectiveness of our proposed approach, as compared to improper or unbounded losses.
arXiv Detail & Related papers (2021-03-04T08:47:07Z) - Adversarial Robustness of Supervised Sparse Coding [34.94566482399662]
We consider a model that involves learning a representation while at the same time giving a precise generalization bound and a robustness certificate.
We focus on the hypothesis class obtained by coupling a sparsity-promoting encoder with a linear classifier.
We provide a robustness certificate for end-to-end classification.
arXiv Detail & Related papers (2020-10-22T22:05:21Z) - Robust Imitation Learning from Noisy Demonstrations [81.67837507534001]
We show that robust imitation learning can be achieved by optimizing a classification risk with a symmetric loss.
We propose a new imitation learning method that effectively combines pseudo-labeling with co-training.
Experimental results on continuous-control benchmarks show that our method is more robust compared to state-of-the-art methods.
arXiv Detail & Related papers (2020-10-20T10:41:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.