Semantically-Conditioned Negative Samples for Efficient Contrastive Learning
- URL: http://arxiv.org/abs/2102.06603v1
- Date: Fri, 12 Feb 2021 16:26:52 GMT
- Title: Semantically-Conditioned Negative Samples for Efficient Contrastive Learning
- Authors: James O' Neill, Danushka Bollegala
- Abstract summary: Uniform negative sampling provides little information about the class boundaries.
We propose three novel techniques for efficient negative sampling.
Our experiments on CIFAR-10, CIFAR-100 and Tiny-ImageNet-200 show consistent performance improvements.
- Score: 22.631763991832862
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Negative sampling is a limiting factor with respect to the
generalization of metric-learned neural networks. We show that uniform negative
sampling provides little information about the class boundaries and thus propose
three novel techniques for efficient negative sampling: drawing negative samples
from (1) the top-$k$ most semantically similar classes, (2) the top-$k$ most
semantically similar samples, and (3) interpolating between contrastive latent
representations to create pseudo negatives. Our experiments on CIFAR-10,
CIFAR-100 and Tiny-ImageNet-200 show that the proposed Semantically Conditioned
Negative Sampling and Latent Mixup lead to consistent performance improvements.
In the standard supervised learning setting, we increase test accuracy on
CIFAR-10 by 1.52 percentage points on average across various network
architectures. In the knowledge distillation setting, (1) the performance of
student networks increases by 4.56 percentage points on Tiny-ImageNet-200 and
3.29 percentage points on CIFAR-100 over student networks trained with no
teacher, and (2) by 1.23 and 1.72 percentage points, respectively, over a
hard-to-beat baseline (Hinton et al., 2015).
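The three sampling strategies lend themselves to a compact implementation. Below is a minimal PyTorch sketch of the ideas as described in the abstract; the class-similarity matrix, the sample bank, and hyper-parameters such as k and the Beta concentration are illustrative assumptions, not the authors' released code.

```python
# Illustrative sketch only: tensor shapes, helper names and hyper-parameters are assumptions.
import torch
import torch.nn.functional as F

def topk_similar_class_negatives(anchor_labels, class_sim, k=5):
    """(1) Restrict negatives to the k classes most semantically similar to the
    anchor's class (the anchor's own class is masked out)."""
    sim = class_sim[anchor_labels].clone()                        # (B, C)
    sim[torch.arange(sim.size(0)), anchor_labels] = float("-inf") # mask own class
    return sim.topk(k, dim=1).indices                             # (B, k) candidate classes

def topk_similar_sample_negatives(anchor_z, bank_z, anchor_y, bank_y, k=16):
    """(2) Pick the k most similar individual samples from a bank, excluding
    samples that share the anchor's label."""
    sim = F.normalize(anchor_z, dim=1) @ F.normalize(bank_z, dim=1).T          # (B, N)
    sim = sim.masked_fill(anchor_y.unsqueeze(1) == bank_y.unsqueeze(0), float("-inf"))
    return sim.topk(k, dim=1).indices                             # (B, k) hard-negative indices

def latent_mixup_negatives(z_a, z_b, alpha=0.4):
    """(3) Interpolate between two negative latent representations to create
    pseudo negatives (Latent Mixup)."""
    lam = torch.distributions.Beta(alpha, alpha).sample((z_a.size(0), 1))
    return lam * z_a + (1 - lam) * z_b
```

The returned class or sample indices can then be used to gather negatives for a standard contrastive or distillation objective.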
Related papers
- Variational Supervised Contrastive Learning [50.79938854370321]
We propose Variational Supervised Contrastive Learning (VarCon), which reformulates supervised contrastive learning as variational inference over latent class variables.
VarCon achieves state-of-the-art performance among contrastive learning frameworks, reaching 79.36% Top-1 accuracy on ImageNet-1K and 78.29% on CIFAR-100 with a ResNet-50 encoder.
arXiv Detail & Related papers (2025-06-09T04:19:12Z)
- $\mathbb{X}$-Sample Contrastive Loss: Improving Contrastive Learning with Sample Similarity Graphs [62.565573316667276]
We develop an objective that encodes how a sample relates to others.
We train vision models based on similarities in class or text caption descriptions.
Our objective appears to work particularly well in lower-data regimes, with gains over CLIP of 16.8% on ImageNet and 18.1% on ImageNet Real.
arXiv Detail & Related papers (2024-07-25T15:38:16Z)
- When hard negative sampling meets supervised contrastive learning [17.173114048398947]
We introduce a new supervised contrastive learning objective, SCHaNe, which incorporates hard negative sampling during the fine-tuning phase.
SCHaNe outperforms the strong baseline BEiT-3 in Top-1 accuracy across various benchmarks.
Our proposed objective sets a new state-of-the-art for base models on ImageNet-1k, achieving an 86.14% accuracy.
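For orientation, the sketch below illustrates the general idea of emphasizing harder (more similar) negatives inside a supervised contrastive loss; the temperature, the weighting scheme and the beta parameter are generic assumptions, not SCHaNe's published objective.

```python
# Hedged sketch: a generic hard-negative-weighted supervised contrastive loss.
import torch
import torch.nn.functional as F

def supcon_hard_negatives(z, labels, tau=0.1, beta=0.5):
    z = F.normalize(z, dim=1)
    sim = z @ z.T / tau                                           # (B, B) scaled similarities
    eye = torch.eye(len(z), dtype=torch.bool, device=z.device)
    pos = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~eye     # same-class pairs
    neg = labels.unsqueeze(0) != labels.unsqueeze(1)              # different-class pairs

    exp_sim = sim.exp()
    # Importance weights that emphasize negatives lying close to the anchor.
    w = (beta * sim).detach().exp() * neg
    w = w / w.sum(1, keepdim=True).clamp_min(1e-12) * neg.sum(1, keepdim=True)

    denom = (exp_sim * pos).sum(1, keepdim=True) + (w * exp_sim).sum(1, keepdim=True)
    log_prob = sim - denom.log()
    return -((log_prob * pos).sum(1) / pos.sum(1).clamp_min(1)).mean()
```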
arXiv Detail & Related papers (2023-08-28T20:30:10Z)
- Your Negative May not Be True Negative: Boosting Image-Text Matching with False Negative Elimination [62.18768931714238]
We propose a novel False Negative Elimination (FNE) strategy to select negatives via sampling.
The results demonstrate the superiority of our proposed false negative elimination strategy.
arXiv Detail & Related papers (2023-08-08T16:31:43Z)
- Do More Negative Samples Necessarily Hurt in Contrastive Learning? [25.234544066205547]
We show in a simple theoretical setting, where positive pairs are generated by sampling from the underlying latent class, that the downstream performance of the representation does not degrade with the number of negative samples.
We also give a structural characterization of the optimal representation in our framework.
arXiv Detail & Related papers (2022-05-03T21:29:59Z)
- CaCo: Both Positive and Negative Samples are Directly Learnable via Cooperative-adversarial Contrastive Learning [45.68097757313092]
We train an encoder by distinguishing positive samples from negative ones given query anchors.
We show that the positive and negative samples can be cooperatively and adversarially learned by minimizing and maximizing the contrastive loss.
The proposed method achieves 71.3% and 75.3% top-1 accuracy after 200 and 800 epochs, respectively, of pre-training a ResNet-50 backbone on ImageNet-1K.
arXiv Detail & Related papers (2022-03-27T18:50:39Z)
- MIO: Mutual Information Optimization using Self-Supervised Binary Contrastive Learning [12.365801596593936]
We model our pre-training task as a binary classification problem to induce an implicit contrastive effect.
Unlike existing methods, the proposed loss function optimizes the mutual information of positive and negative pairs.
The proposed method outperforms SOTA self-supervised contrastive frameworks on benchmark datasets.
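As a rough illustration of casting contrastive pre-training as binary classification, the sketch below applies a binary cross-entropy loss to pairwise cosine similarities, labelling two views of the same image as 1 and cross-image pairs as 0; the temperature and pairing scheme are assumptions, not the paper's exact formulation.

```python
# Hedged sketch: binary classification over view pairs as an implicit contrastive objective.
import torch
import torch.nn.functional as F

def binary_contrastive_loss(z1, z2, tau=0.1):
    """z1, z2: (B, D) embeddings of two augmented views of the same batch."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.T / tau                          # (B, B) pairwise similarities
    targets = torch.eye(len(z1), device=z1.device)    # matching views -> 1, others -> 0
    return F.binary_cross_entropy_with_logits(logits, targets)
```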
arXiv Detail & Related papers (2021-11-24T17:51:29Z)
- Weakly Supervised Contrastive Learning [68.47096022526927]
We introduce a weakly supervised contrastive learning framework (WCL) to tackle this issue.
WCL achieves 65% and 72% ImageNet Top-1 Accuracy using ResNet50, which is even higher than SimCLRv2 with ResNet101.
arXiv Detail & Related papers (2021-10-10T12:03:52Z)
- Rethinking InfoNCE: How Many Negative Samples Do You Need? [54.146208195806636]
We study how many negative samples are optimal for InfoNCE in different scenarios via a semi-quantitative theoretical framework.
We estimate the optimal negative sampling ratio using the $K$ value that maximizes the training effectiveness function.
arXiv Detail & Related papers (2021-05-27T08:38:29Z)
- With a Little Help from My Friends: Nearest-Neighbor Contrastive Learning of Visual Representations [87.72779294717267]
Using the nearest neighbor as the positive in contrastive losses significantly improves performance on ImageNet classification.
We demonstrate empirically that our method is less reliant on complex data augmentations.
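A minimal sketch of the nearest-neighbor idea, assuming a memory queue of past embeddings as the support set; the queue size, temperature, and lack of symmetrization are simplifications rather than the paper's exact recipe.

```python
# Hedged sketch: a nearest neighbor from a support queue replaces the usual positive.
import torch
import torch.nn.functional as F

def nn_contrastive_loss(z_query, z_key, queue, tau=0.1):
    """z_query, z_key: (B, D) embeddings of two views; queue: (Q, D) support embeddings."""
    z_query, z_key, queue = (F.normalize(t, dim=1) for t in (z_query, z_key, queue))
    nn_pos = queue[(z_query @ queue.T).argmax(dim=1)]   # nearest neighbor of each query
    logits = nn_pos @ z_key.T / tau                     # (B, B)
    targets = torch.arange(len(z_query), device=z_query.device)
    return F.cross_entropy(logits, targets)
```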
arXiv Detail & Related papers (2021-04-29T17:56:08Z)
- Doubly Contrastive Deep Clustering [135.7001508427597]
We present a novel Doubly Contrastive Deep Clustering (DCDC) framework, which constructs contrastive loss over both sample and class views.
Specifically, for the sample view, we set the class distribution of the original sample and its augmented version as positive sample pairs.
For the class view, we build the positive and negative pairs from the sample distribution of the class.
In this way, the two contrastive losses constrain the clustering results of mini-batch samples at both the sample and class levels.
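A hedged sketch of the two views described above, applying an InfoNCE-style loss over the rows (per-sample class distributions) and columns (per-class sample distributions) of the soft-assignment matrices; the temperature and normalization are assumptions, not the paper's released code.

```python
# Hedged sketch: contrastive losses over both sample and class views of soft assignments.
import torch
import torch.nn.functional as F

def info_nce(a, b, tau=0.5):
    """Row i of `a` and row i of `b` form a positive pair; all other rows are negatives."""
    a, b = F.normalize(a, dim=1), F.normalize(b, dim=1)
    logits = a @ b.T / tau
    return F.cross_entropy(logits, torch.arange(len(a), device=a.device))

def doubly_contrastive_loss(p1, p2, tau=0.5):
    """p1, p2: (B, C) class-probability matrices of a batch and its augmented view."""
    sample_view = info_nce(p1, p2, tau)      # compare per-sample class distributions
    class_view = info_nce(p1.T, p2.T, tau)   # compare per-class sample distributions
    return sample_view + class_view
```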
arXiv Detail & Related papers (2021-03-09T15:15:32Z)
- SCE: Scalable Network Embedding from Sparsest Cut [20.08464038805681]
Large-scale network embedding learns a latent representation for each node in an unsupervised manner.
A key to the success of such contrastive learning methods is how to draw positive and negative samples.
In this paper, we propose SCE for unsupervised network embedding, using only negative samples for training.
arXiv Detail & Related papers (2020-06-30T03:18:15Z)
- SCAN: Learning to Classify Images without Labels [73.69513783788622]
We advocate a two-step approach where feature learning and clustering are decoupled.
A self-supervised task from representation learning is employed to obtain semantically meaningful features.
We obtain promising results on ImageNet, and outperform several semi-supervised learning methods in the low-data regime.
arXiv Detail & Related papers (2020-05-25T18:12:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.