When hard negative sampling meets supervised contrastive learning
- URL: http://arxiv.org/abs/2308.14893v1
- Date: Mon, 28 Aug 2023 20:30:10 GMT
- Title: When hard negative sampling meets supervised contrastive learning
- Authors: Zijun Long, George Killick, Richard McCreadie, Gerardo Aragon Camarasa, Zaiqiao Meng
- Abstract summary: We introduce a new supervised contrastive learning objective, SCHaNe, which incorporates hard negative sampling during the fine-tuning phase.
SCHaNe outperforms the strong baseline BEiT-3 in Top-1 accuracy across various benchmarks.
Our proposed objective sets a new state-of-the-art for base models on ImageNet-1k, achieving an 86.14% accuracy.
- Score: 17.173114048398947
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: State-of-the-art image models predominantly follow a two-stage strategy:
pre-training on large datasets and fine-tuning with cross-entropy loss. Many
studies have shown that using cross-entropy can result in sub-optimal
generalisation and stability. While the supervised contrastive loss addresses
some limitations of cross-entropy loss by focusing on intra-class similarities
and inter-class differences, it neglects the importance of hard negative
mining. We propose that model performance can be improved by weighting
negative samples based on their dissimilarity to positive
counterparts. In this paper, we introduce a new supervised contrastive learning
objective, SCHaNe, which incorporates hard negative sampling during the
fine-tuning phase. Without requiring specialized architectures, additional
data, or extra computational resources, experimental results indicate that
SCHaNe outperforms the strong baseline BEiT-3 in Top-1 accuracy across various
benchmarks, with significant gains of up to $3.32\%$ in few-shot learning
settings and $3.41\%$ in full dataset fine-tuning. Importantly, our proposed
objective sets a new state-of-the-art for base models on ImageNet-1k, achieving
an 86.14\% accuracy. Furthermore, we demonstrate that the proposed objective
yields better embeddings and explains the improved effectiveness observed in
our experiments.
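The core idea in the abstract, upweighting negatives in a supervised contrastive loss according to how hard they are, can be sketched as follows. This is a hypothetical NumPy illustration, not the paper's exact SCHaNe objective: the function name, the `beta` exponent, and the weight normalisation are assumptions. With `beta=0` the weights are uniform and the loss reduces to a plain supervised contrastive form.

```python
import numpy as np

def schane_style_loss(embeddings, labels, temperature=0.1, beta=1.0):
    """Supervised contrastive loss with hard-negative weighting (sketch).

    Negatives more similar to the anchor (i.e. harder) receive larger
    weights; `beta` controls how sharply hardness is emphasised.
    """
    # L2-normalise embeddings and compute exponentiated similarities.
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = np.exp(z @ z.T / temperature)
    n = len(labels)
    total, anchors = 0.0, 0
    for i in range(n):
        pos = [j for j in range(n) if j != i and labels[j] == labels[i]]
        neg = [j for j in range(n) if labels[j] != labels[i]]
        if not pos or not neg:
            continue
        # Hard-negative weights: proportional to similarity to the anchor,
        # normalised so they sum to the number of negatives.
        w = sim[i, neg] ** beta
        w = w * len(neg) / w.sum()
        neg_term = float((w * sim[i, neg]).sum())
        # Average the contrastive log-ratio over all positives of anchor i.
        total += -np.mean([np.log(sim[i, j] / (sim[i, j] + neg_term))
                           for j in pos])
        anchors += 1
    return total / max(anchors, 1)
```

Because the weights grow with similarity, the weighted negative term is never smaller than the uniform one, so emphasising hard negatives strictly tightens the loss whenever negatives differ in similarity.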
Related papers
- CLCE: An Approach to Refining Cross-Entropy and Contrastive Learning for Optimized Learning Fusion [15.106479030601378]
Cross-Entropy loss (CE) can compromise model generalization and stability.
We introduce a novel approach named CLCE, which integrates Contrastive Learning with CE.
We show that CLCE significantly outperforms CE in Top-1 accuracy across twelve benchmarks.
arXiv Detail & Related papers (2024-02-22T13:45:01Z)
- OOD-CV-v2: An extended Benchmark for Robustness to Out-of-Distribution Shifts of Individual Nuisances in Natural Images [59.51657161097337]
OOD-CV-v2 is a benchmark dataset that includes out-of-distribution examples of 10 object categories in terms of pose, shape, texture, context and the weather conditions.
In addition to this novel dataset, we contribute extensive experiments using popular baseline methods.
arXiv Detail & Related papers (2023-04-17T20:39:25Z)
- Rethinking Prototypical Contrastive Learning through Alignment, Uniformity and Correlation [24.794022951873156]
We propose to learn Prototypical representation through Alignment, Uniformity and Correlation (PAUC)
Specifically, the ordinary ProtoNCE loss is revised with: (1) an alignment loss that pulls embeddings from positive prototypes together; (2) a loss that distributes the prototypical level features uniformly; (3) a correlation loss that increases the diversity and discriminability between prototypical level features.
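The three losses listed above can be sketched in NumPy. This is a hedged reconstruction from the summary alone, not the PAUC paper's exact formulation: the function names and the `t` kernel parameter are assumptions, the uniformity term follows the common Gaussian-kernel formulation on the hypersphere, and the correlation term is rendered here as an off-diagonal decorrelation penalty.

```python
import numpy as np

def alignment_loss(z, prototypes, assignments):
    # (1) pull each embedding toward its assigned positive prototype
    return np.mean(np.sum((z - prototypes[assignments]) ** 2, axis=1))

def uniformity_loss(prototypes, t=2.0):
    # (2) encourage prototype-level features to spread out uniformly:
    # log of the mean Gaussian kernel over distinct prototype pairs
    d2 = np.sum((prototypes[:, None] - prototypes[None, :]) ** 2, axis=-1)
    mask = ~np.eye(len(prototypes), dtype=bool)
    return np.log(np.mean(np.exp(-t * d2[mask])))

def correlation_loss(prototypes):
    # (3) penalise off-diagonal correlations between prototypes to
    # increase their diversity and discriminability
    c = np.corrcoef(prototypes)
    return np.sum(c[~np.eye(len(c), dtype=bool)] ** 2)
```

Alignment is minimised at zero when embeddings coincide with their prototypes, while uniformity is negative and decreases as prototypes move apart.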
arXiv Detail & Related papers (2022-10-18T22:33:12Z)
- Siamese Prototypical Contrastive Learning [24.794022951873156]
Contrastive Self-supervised Learning (CSL) is a practical solution that learns meaningful visual representations from massive data in an unsupervised manner.
In this paper, we tackle this problem by introducing a simple but effective contrastive learning framework.
The key insight is to employ siamese-style metric loss to match intra-prototype features, while increasing the distance between inter-prototype features.
arXiv Detail & Related papers (2022-08-18T13:25:30Z)
- Guided Point Contrastive Learning for Semi-supervised Point Cloud Semantic Segmentation [90.2445084743881]
We present a method for semi-supervised point cloud semantic segmentation to adopt unlabeled point clouds in training to boost the model performance.
Inspired by the recent contrastive loss in self-supervised tasks, we propose the guided point contrastive loss to enhance the feature representation and model generalization ability.
arXiv Detail & Related papers (2021-10-15T16:38:54Z)
- Are Negative Samples Necessary in Entity Alignment? An Approach with High Performance, Scalability and Robustness [26.04006507181558]
We propose a novel EA method with three new components to enable high Performance, high Scalability, and high Robustness.
We conduct detailed experiments on several public datasets to examine the effectiveness and efficiency of our proposed method.
arXiv Detail & Related papers (2021-08-11T15:20:41Z)
- Improving Contrastive Learning by Visualizing Feature Transformation [37.548120912055595]
In this paper, we attempt to devise a feature-level data manipulation, differing from data augmentation, to enhance the generic contrastive self-supervised learning.
We first design a visualization scheme for pos/neg score (Pos/neg score indicates similarity of pos/neg pair.) distribution, which enables us to analyze, interpret and understand the learning process.
Experiment results show that our proposed Feature Transformation can improve at least 6.0% accuracy on ImageNet-100 over MoCo baseline, and about 2.0% accuracy on ImageNet-1K over the MoCoV2 baseline.
arXiv Detail & Related papers (2021-08-06T07:26:08Z)
- Rethinking InfoNCE: How Many Negative Samples Do You Need? [54.146208195806636]
We study how many negative samples are optimal for InfoNCE in different scenarios via a semi-quantitative theoretical framework.
We estimate the optimal negative sampling ratio using the $K$ value that maximizes the training effectiveness function.
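The role of the negative count in InfoNCE can be made concrete with a minimal sketch. This is a generic single-anchor formulation under assumed names, not the cited paper's framework; it only illustrates that each added negative enlarges the denominator, which is the quantity whose optimal count $K$ the paper studies.

```python
import numpy as np

def info_nce(anchor, positive, negatives, temperature=0.07):
    """InfoNCE loss for one anchor with K explicit negatives (sketch)."""
    def sim(a, b):
        # cosine similarity between two vectors
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    pos = np.exp(sim(anchor, positive) / temperature)
    neg = sum(np.exp(sim(anchor, n) / temperature) for n in negatives)
    return -np.log(pos / (pos + neg))
```

Since every negative contributes a positive term to the denominator, the loss is monotonically non-decreasing in K; the interesting question the paper addresses is when additional negatives stop improving the learned representation.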
arXiv Detail & Related papers (2021-05-27T08:38:29Z)
- Solving Inefficiency of Self-supervised Representation Learning [87.30876679780532]
Existing contrastive learning methods suffer from very low learning efficiency.
Under-clustering and over-clustering problems are major obstacles to learning efficiency.
We propose a novel self-supervised learning framework using a median triplet loss.
arXiv Detail & Related papers (2021-04-18T07:47:10Z)
- Doubly Contrastive Deep Clustering [135.7001508427597]
We present a novel Doubly Contrastive Deep Clustering (DCDC) framework, which constructs contrastive loss over both sample and class views.
Specifically, for the sample view, we set the class distribution of the original sample and its augmented version as positive sample pairs.
For the class view, we build the positive and negative pairs from the sample distribution of the class.
In this way, two contrastive losses successfully constrain the clustering results of mini-batch samples in both sample and class level.
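The two-view construction described above can be sketched as one contrastive routine applied to a batch-by-class probability matrix and to its transpose. This is a hedged illustration of the idea, not the DCDC paper's implementation: the function names and temperature are assumptions, and `p_orig`/`p_aug` stand for the predicted class distributions of a mini-batch and its augmented version.

```python
import numpy as np

def contrast(a, b, temperature=0.5):
    """Row-wise contrastive loss: row i of `a` should match row i of `b`."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    logits = np.exp(a @ b.T / temperature)
    return -np.mean(np.log(np.diag(logits) / logits.sum(axis=1)))

def dcdc_style_loss(p_orig, p_aug):
    """Doubly contrastive sketch over a (batch x classes) matrix.

    Rows give the sample view (each sample's class distribution);
    columns give the class view (each class's sample distribution).
    """
    sample_view = contrast(p_orig, p_aug)        # contrast over rows
    class_view = contrast(p_orig.T, p_aug.T)     # contrast over columns
    return sample_view + class_view
```

Running the same routine on the matrix and its transpose is what constrains the clustering at both the sample level and the class level simultaneously.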
arXiv Detail & Related papers (2021-03-09T15:15:32Z)
- Unleashing the Power of Contrastive Self-Supervised Visual Models via Contrast-Regularized Fine-Tuning [94.35586521144117]
We investigate whether applying contrastive learning to fine-tuning would bring further benefits.
We propose Contrast-regularized tuning (Core-tuning), a novel approach for fine-tuning contrastive self-supervised visual models.
arXiv Detail & Related papers (2021-02-12T16:31:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.