Bayesian Self-Supervised Contrastive Learning
- URL: http://arxiv.org/abs/2301.11673v4
- Date: Wed, 31 Jan 2024 03:02:06 GMT
- Title: Bayesian Self-Supervised Contrastive Learning
- Authors: Bin Liu, Bang Wang, Tianrui Li
- Abstract summary: This paper proposes a new self-supervised contrastive loss called the BCL loss.
The key idea is to design the desired sampling distribution for sampling hard true negative samples under the Bayesian framework.
Experiments validate the effectiveness and superiority of the BCL loss.
- Score: 16.903874675729952
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent years have witnessed many successful applications of contrastive
learning in diverse domains, yet its self-supervised version still presents many
open challenges. Because the negative samples are drawn from unlabeled datasets,
a randomly selected sample may in fact be a false negative for an anchor, leading
to incorrect encoder training. This paper proposes a new self-supervised
contrastive loss, called the BCL loss, that still uses random samples from the
unlabeled data while correcting the resulting bias with importance weights. The
key idea is to design the desired sampling distribution for sampling hard true
negative samples under a Bayesian framework. The prominent advantage is that the
desired sampling distribution has a parametric structure, with a location
parameter for debiasing false negatives and a concentration parameter for mining
hard negatives. Experiments validate the effectiveness and superiority of the
BCL loss.
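As a rough illustration of the weighting mechanism described above (negatives are still drawn at random, but each is reweighted so that likely false negatives are debiased and hard true negatives are emphasized), the following minimal PyTorch sketch may help. It follows the closely related debiased and hard-negative contrastive objectives of Chuang et al. (2020) and Robinson et al. (2021) rather than the authors' BCL implementation; the parameter names `tau_plus` (a class-prior, location-like debiasing term) and `beta` (a concentration term) and their default values are illustrative assumptions.

```python
# Minimal sketch (not the authors' BCL code): an importance-weighted InfoNCE-style
# objective in which `beta` concentrates weight on hard negatives and `tau_plus`
# corrects for the chance that a random "negative" is actually a false negative.
import math

import torch


def weighted_contrastive_loss(anchor, positive, negatives,
                              temperature=0.5, tau_plus=0.1, beta=1.0):
    """anchor, positive: (B, D); negatives: (B, N, D). All vectors L2-normalized."""
    n = negatives.size(1)
    pos = torch.exp((anchor * positive).sum(dim=-1) / temperature)                 # (B,)
    neg = torch.exp(torch.einsum('bd,bnd->bn', anchor, negatives) / temperature)   # (B, N)

    # Concentration: up-weight negatives that are more similar to the anchor.
    imp = (beta * neg.log()).exp()                                                 # (B, N)
    reweighted_neg = (imp * neg).sum(dim=-1) / imp.mean(dim=-1)                    # (B,)

    # Debiasing: subtract the expected contribution of false negatives, assuming
    # probability `tau_plus` that a random sample shares the anchor's latent class;
    # clamp at the theoretical minimum for numerical stability.
    ng = (-tau_plus * n * pos + reweighted_neg) / (1.0 - tau_plus)
    ng = torch.clamp(ng, min=n * math.exp(-1.0 / temperature))

    return -torch.log(pos / (pos + ng)).mean()
```

Setting `beta = 0` and `tau_plus = 0` recovers a plain InfoNCE-style loss with uniformly weighted random negatives, which makes the role of the two parameters easy to inspect.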
Related papers
- Learning with Imbalanced Noisy Data by Preventing Bias in Sample Selection [82.43311784594384]
Real-world datasets contain not only noisy labels but also class imbalance.
We propose a simple yet effective method to address noisy labels in imbalanced datasets.
arXiv Detail & Related papers (2024-02-17T10:34:53Z)
- Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation [63.180725016463974]
Cross-modal retrieval relies on well-matched large-scale datasets that are laborious to collect in practice.
We introduce a novel noisy correspondence learning framework, namely Self-Reinforcing Errors Mitigation (SREM).
arXiv Detail & Related papers (2023-12-27T09:03:43Z)
- CLAF: Contrastive Learning with Augmented Features for Imbalanced Semi-Supervised Learning [40.5117833362268]
Semi-supervised learning and contrastive learning have been progressively combined to achieve better performances in popular applications.
One common approach is to assign pseudo-labels to unlabeled samples and then select positive and negative samples from the pseudo-labeled samples to apply contrastive learning.
We propose Contrastive Learning with Augmented Features (CLAF) to alleviate the scarcity of minority class samples in contrastive learning.
arXiv Detail & Related papers (2023-12-15T08:27:52Z)
- Your Negative May not Be True Negative: Boosting Image-Text Matching with False Negative Elimination [62.18768931714238]
We propose a novel False Negative Elimination (FNE) strategy to select negatives via sampling.
The results demonstrate the superiority of our proposed false negative elimination strategy.
arXiv Detail & Related papers (2023-08-08T16:31:43Z)
- Learning with Noisy Labels over Imbalanced Subpopulations [13.477553187049462]
Learning with noisy labels (LNL) has attracted significant attention from the research community.
We propose a novel LNL method to simultaneously deal with noisy labels and imbalanced subpopulations.
We introduce a feature-based metric that takes the sample correlation into account for estimating samples' clean probabilities.
arXiv Detail & Related papers (2022-11-16T07:25:24Z)
- Rethinking Collaborative Metric Learning: Toward an Efficient Alternative without Negative Sampling [156.7248383178991]
The Collaborative Metric Learning (CML) paradigm has aroused wide interest in the area of recommendation systems (RS).
We find that negative sampling would lead to a biased estimation of the generalization error.
Motivated by this, we propose an efficient alternative without negative sampling for CML, named Sampling-Free Collaborative Metric Learning (SFCML).
arXiv Detail & Related papers (2022-06-23T08:50:22Z)
- Exploring the Impact of Negative Samples of Contrastive Learning: A Case Study of Sentence Embedding [14.295787044482136]
We present a momentum contrastive learning model with a negative sample queue for sentence embedding, namely MoCoSE.
We define a maximum traceable distance metric, through which we learn to what extent the text contrastive learning benefits from the historical information of negative samples.
Our experiments find that the best results are obtained when the maximum traceable distance is at a certain range, demonstrating that there is an optimal range of historical information for a negative sample queue.
arXiv Detail & Related papers (2022-02-26T08:29:25Z)
- Uncertainty-aware Pseudo-label Selection for Positive-Unlabeled Learning [10.014356492742074]
We propose to tackle the issues of imbalanced datasets and model calibration in a positive-unlabeled learning setting.
By boosting the signal from the minority class, pseudo-labeling expands the labeled dataset with new samples from the unlabeled set.
In a series of experiments, PUUPL yields substantial performance gains in highly imbalanced settings.
arXiv Detail & Related papers (2022-01-31T12:55:47Z)
- Rethinking InfoNCE: How Many Negative Samples Do You Need? [54.146208195806636]
We study how many negative samples are optimal for InfoNCE in different scenarios via a semi-quantitative theoretical framework.
We estimate the optimal negative sampling ratio using the $K$ value that maximizes the training effectiveness function.
arXiv Detail & Related papers (2021-05-27T08:38:29Z)
- Doubly Contrastive Deep Clustering [135.7001508427597]
We present a novel Doubly Contrastive Deep Clustering (DCDC) framework, which constructs contrastive loss over both sample and class views.
Specifically, for the sample view, we set the class distribution of the original sample and its augmented version as positive sample pairs.
For the class view, we build the positive and negative pairs from the sample distribution of the class.
In this way, the two contrastive losses jointly constrain the clustering of mini-batch samples at both the sample and class levels (see the illustrative sketch after this list).
arXiv Detail & Related papers (2021-03-09T15:15:32Z)
- Sampler Design for Implicit Feedback Data by Noisy-label Robust Learning [32.76804332450971]
We design an adaptive sampler based on noisy-label robust learning for implicit feedback data.
We predict users' preferences with the model and learn it by maximizing the likelihood of the observed data labels.
We then consider the risk of these noisy labels, and propose a Noisy-label Robust BPO.
arXiv Detail & Related papers (2020-06-28T05:31:53Z)
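The Doubly Contrastive Deep Clustering entry above describes contrasting both rows (per-sample class distributions) and columns (per-class assignment distributions over the mini-batch) of the softmax output. The sketch referenced there is given below; it is a hypothetical illustration of that sample-view / class-view construction under assumed names and a made-up temperature, not the DCDC authors' code.

```python
# Hypothetical sketch of a doubly contrastive objective: the same InfoNCE machinery
# is applied to rows of the class-probability matrix (sample view) and, after a
# transpose, to its columns (class view).
import torch
import torch.nn.functional as F


def info_nce(a, b, temperature=0.5):
    """Contrast paired rows of a and b (both (M, D)); diagonal pairs are positives."""
    a = F.normalize(a, dim=1)
    b = F.normalize(b, dim=1)
    logits = a @ b.t() / temperature
    targets = torch.arange(a.size(0), device=a.device)
    return F.cross_entropy(logits, targets)


def doubly_contrastive_loss(p_orig, p_aug, temperature=0.5):
    """p_orig, p_aug: (B, C) softmax class probabilities for a mini-batch and its
    augmented version."""
    sample_view = info_nce(p_orig, p_aug, temperature)          # rows: per-sample class distributions
    class_view = info_nce(p_orig.t(), p_aug.t(), temperature)   # columns: per-class sample distributions
    return sample_view + class_view
```

Transposing the probability matrices is what turns the same pairwise machinery into a class-level objective: each column describes how one class distributes its assignment probability over the samples in the mini-batch.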
This list is automatically generated from the titles and abstracts of the papers on this site.