Variational Supervised Contrastive Learning
- URL: http://arxiv.org/abs/2506.07413v2
- Date: Thu, 26 Jun 2025 12:27:25 GMT
- Title: Variational Supervised Contrastive Learning
- Authors: Ziwen Wang, Jiajun Fan, Thao Nguyen, Heng Ji, Ge Liu
- Abstract summary: We propose Variational Supervised Contrastive Learning (VarCon), which reformulates supervised contrastive learning as variational inference over latent class variables. VarCon achieves state-of-the-art performance for contrastive learning frameworks, reaching 79.36% Top-1 accuracy on ImageNet-1K and 78.29% on CIFAR-100 with a ResNet-50 encoder.
- Score: 50.79938854370321
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Contrastive learning has proven to be highly efficient and adaptable in shaping representation spaces across diverse modalities by pulling similar samples together and pushing dissimilar ones apart. However, two key limitations persist: (1) without explicit regulation of the embedding distribution, semantically related instances can inadvertently be pushed apart unless complementary signals guide pair selection, and (2) excessive reliance on large in-batch negatives and tailored augmentations hinders generalization. To address these limitations, we propose Variational Supervised Contrastive Learning (VarCon), which reformulates supervised contrastive learning as variational inference over latent class variables and maximizes a posterior-weighted evidence lower bound (ELBO) that replaces exhaustive pair-wise comparisons for efficient class-aware matching and grants fine-grained control over intra-class dispersion in the embedding space. Trained exclusively on image data, our experiments on CIFAR-10, CIFAR-100, ImageNet-100, and ImageNet-1K show that VarCon (1) achieves state-of-the-art performance for contrastive learning frameworks, reaching 79.36% Top-1 accuracy on ImageNet-1K and 78.29% on CIFAR-100 with a ResNet-50 encoder while converging in just 200 epochs; (2) yields substantially clearer decision boundaries and semantic organization in the embedding space, as evidenced by KNN classification, hierarchical clustering results, and transfer-learning assessments; and (3) demonstrates superior few-shot learning performance compared with the supervised baseline, as well as superior robustness across various augmentation strategies.
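The posterior-weighted, class-aware matching idea from the abstract can be sketched as follows. This is a minimal NumPy illustration, assuming a softmax posterior over learnable class prototypes; the function and variable names are ours, and it is not the paper's exact ELBO:

```python
import numpy as np

def varcon_style_loss(z, labels, prototypes, temperature=0.1):
    """Hypothetical sketch of a posterior-weighted, class-aware
    contrastive objective (illustrative, not the paper's exact loss).

    z:          (N, D) L2-normalized embeddings
    labels:     (N,) integer class labels
    prototypes: (C, D) L2-normalized class prototypes (latent class variables)
    """
    logits = z @ prototypes.T / temperature            # (N, C) similarity to each class
    logits -= logits.max(axis=1, keepdims=True)        # numerical stability
    posterior = np.exp(logits)
    posterior /= posterior.sum(axis=1, keepdims=True)  # soft class posterior q(c | z)
    # Class-aware matching: maximize posterior mass on the true class,
    # replacing exhaustive pairwise comparisons with N x C comparisons.
    nll = -np.log(posterior[np.arange(len(labels)), labels] + 1e-12)
    return nll.mean()
```

Note how the cost scales with the number of classes rather than the batch size squared, which reflects the abstract's claim of avoiding exhaustive pair-wise comparisons.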
Related papers
- Fine-Grained Representation Learning via Multi-Level Contrastive Learning without Class Priors [3.050634053489509]
Contrastive Disentangling (CD) is a framework designed to learn representations without relying on class priors.
CD integrates instance-level and feature-level contrastive losses with a normalized entropy loss to capture semantically rich and fine-grained representations.
arXiv Detail & Related papers (2024-09-07T16:39:14Z)
- ContraCluster: Learning to Classify without Labels by Contrastive Self-Supervision and Prototype-Based Semi-Supervision [7.819942809508631]
We propose ContraCluster, an unsupervised image classification method that combines clustering with the power of contrastive self-supervised learning.
ContraCluster consists of three stages: (1) contrastive self-supervised pre-training (CPT), (2) contrastive prototype sampling (CPS), and (3) prototype-based semi-supervised fine-tuning (PB-SFT).
We demonstrate empirically that ContraCluster achieves new state-of-the-art results for standard benchmark datasets including CIFAR-10, STL-10, and ImageNet-10.
arXiv Detail & Related papers (2023-04-19T01:51:08Z)
- Unifying Synergies between Self-supervised Learning and Dynamic Computation [53.66628188936682]
We present a novel perspective on the interplay between SSL and DC paradigms.
We show that it is feasible to simultaneously learn a dense and gated sub-network from scratch in a SSL setting.
The co-evolution of the dense and gated encoders during pre-training offers a good accuracy-efficiency trade-off.
arXiv Detail & Related papers (2023-01-22T17:12:58Z)
- Rethinking Prototypical Contrastive Learning through Alignment, Uniformity and Correlation [24.794022951873156]
We propose to learn Prototypical representation through Alignment, Uniformity and Correlation (PAUC).
Specifically, the ordinary ProtoNCE loss is revised with: (1) an alignment loss that pulls embeddings from positive prototypes together; (2) a uniformity loss that distributes prototype-level features uniformly; and (3) a correlation loss that increases the diversity and discriminability between prototype-level features.
arXiv Detail & Related papers (2022-10-18T22:33:12Z)
- Generalized Supervised Contrastive Learning [3.499054375230853]
We introduce a generalized supervised contrastive loss, which measures cross-entropy between label similarity and latent similarity.
Compared to existing contrastive learning frameworks, we construct a tailored framework: the Generalized Supervised Contrastive Learning (GenSCL) framework.
GenSCL achieves a top-1 accuracy of 77.3% on ImageNet, a 4.1% improvement over traditional supervised contrastive learning.
arXiv Detail & Related papers (2022-06-01T10:38:21Z)
- Weakly Supervised Contrastive Learning [68.47096022526927]
We introduce a weakly supervised contrastive learning framework (WCL) to tackle this issue.
WCL achieves 65% and 72% ImageNet Top-1 Accuracy using ResNet50, which is even higher than SimCLRv2 with ResNet101.
arXiv Detail & Related papers (2021-10-10T12:03:52Z)
- Semi-supervised Contrastive Learning with Similarity Co-calibration [72.38187308270135]
We propose a novel training strategy, termed Semi-supervised Contrastive Learning (SsCL).
SsCL combines the well-known contrastive loss in self-supervised learning with the cross entropy loss in semi-supervised learning.
We show that SsCL produces more discriminative representations and is beneficial to few-shot learning.
arXiv Detail & Related papers (2021-05-16T09:13:56Z)
- Unsupervised Representation Learning by Invariance Propagation [34.53866045440319]
In this paper, we propose Invariance Propagation to focus on learning representations invariant to category-level variations.
With a ResNet-50 as the backbone, our method achieves 71.3% top-1 accuracy on ImageNet linear classification and 78.2% top-5 accuracy when fine-tuning on only 1% of labels.
We also achieve state-of-the-art performance on other downstream tasks, including linear classification on Places205 and Pascal VOC, and transfer learning on small scale datasets.
arXiv Detail & Related papers (2020-10-07T13:00:33Z)
- Hybrid Discriminative-Generative Training via Contrastive Learning [96.56164427726203]
We show that, through the perspective of hybrid discriminative-generative training of energy-based models, we can draw a direct connection between contrastive learning and supervised learning.
We show our specific choice of approximation of the energy-based loss outperforms the existing practice in terms of classification accuracy of WideResNet on CIFAR-10 and CIFAR-100.
arXiv Detail & Related papers (2020-07-17T15:50:34Z)
- Generalized Zero-Shot Learning Via Over-Complete Distribution [79.5140590952889]
We propose to generate an Over-Complete Distribution (OCD) using Conditional Variational Autoencoder (CVAE) of both seen and unseen classes.
The effectiveness of the framework is evaluated using both Zero-Shot Learning and Generalized Zero-Shot Learning protocols.
arXiv Detail & Related papers (2020-04-01T19:05:28Z)
- A Simple Framework for Contrastive Learning of Visual Representations [116.37752766922407]
This paper presents SimCLR: a simple framework for contrastive learning of visual representations.
We show that composition of data augmentations plays a critical role in defining effective predictive tasks.
We are able to considerably outperform previous methods for self-supervised and semi-supervised learning on ImageNet.
arXiv Detail & Related papers (2020-02-13T18:50:45Z)
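For contrast with the class-posterior formulation above, SimCLR's instance-level NT-Xent objective can be sketched as follows. This is a minimal NumPy illustration under our own naming, not code from the paper:

```python
import numpy as np

def nt_xent(z, temperature=0.5):
    """Minimal NT-Xent sketch (SimCLR-style, illustrative only).
    z holds 2N L2-normalized embeddings where row i and row i+N
    are two augmented views of the same image."""
    n2 = len(z)
    n = n2 // 2
    sim = z @ z.T / temperature                  # pairwise cosine similarities
    np.fill_diagonal(sim, -np.inf)               # exclude self-similarity
    exp = np.exp(sim - sim.max(axis=1, keepdims=True))  # stabilized softmax
    prob = exp / exp.sum(axis=1, keepdims=True)
    # Each sample's positive is its other view; everything else is a negative.
    pos = np.concatenate([np.arange(n, n2), np.arange(0, n)])
    return -np.log(prob[np.arange(n2), pos] + 1e-12).mean()
```

Unlike the class-aware sketch, every other in-batch sample serves as a negative here, which is exactly the large-negative-batch dependence that the VarCon abstract identifies as a limitation.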
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.