Related papers: A Tale of Two Classes: Adapting Supervised Contrastive Learning to Binary Imbalanced Datasets

URL: http://arxiv.org/abs/2503.17024v1
Date: Fri, 21 Mar 2025 10:34:51 GMT
Title: A Tale of Two Classes: Adapting Supervised Contrastive Learning to Binary Imbalanced Datasets
Authors: David Mildenberger, Paul Hager, Daniel Rueckert, Martin J Menten
Abstract summary: Supervised contrastive learning (SupCon) has proven to be a powerful alternative to the standard cross-entropy loss for classification of balanced datasets. We show that SupCon's performance decreases with increasing class imbalance. We propose two new supervised contrastive learning strategies tailored to binary imbalanced datasets.
Score: 9.413178499853156
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Supervised contrastive learning (SupCon) has proven to be a powerful alternative to the standard cross-entropy loss for classification of multi-class balanced datasets. However, it struggles to learn well-conditioned representations of datasets with long-tailed class distributions. This problem is potentially exacerbated for binary imbalanced distributions, which are commonly encountered during many real-world problems such as medical diagnosis. In experiments on seven binary datasets of natural and medical images, we show that the performance of SupCon decreases with increasing class imbalance. To substantiate these findings, we introduce two novel metrics that evaluate the quality of the learned representation space. By measuring the class distribution in local neighborhoods, we are able to uncover structural deficiencies of the representation space that classical metrics cannot detect. Informed by these insights, we propose two new supervised contrastive learning strategies tailored to binary imbalanced datasets that improve the structure of the representation space and increase downstream classification accuracy over standard SupCon by up to 35%. We make our code available.
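The abstract's idea of measuring the class distribution in local neighborhoods can be illustrated with a small sketch. This is not the paper's metric, whose definition is not given here. It is a generic k-nearest-neighbor check that assumes Euclidean distance. The function names (knn_same_class_fraction, per_class_mean) and the toy 1-D embeddings are hypothetical.

```python
import math


def knn_same_class_fraction(embeddings, labels, k=2):
    """For each point, return the fraction of its k nearest neighbors
    (Euclidean distance, excluding the point itself) that share its label.

    Illustrative only: a generic neighborhood check, not the metric
    defined in the paper.
    """
    n = len(embeddings)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of points")
    fractions = []
    for i in range(n):
        # Distance from point i to every other point, paired with its index.
        dists = sorted(
            (math.dist(embeddings[i], embeddings[j]), j)
            for j in range(n)
            if j != i
        )
        neighbors = [j for _, j in dists[:k]]
        same = sum(1 for j in neighbors if labels[j] == labels[i])
        fractions.append(same / k)
    return fractions


def per_class_mean(fractions, labels):
    """Average the per-point fractions separately for each class."""
    totals = {}
    for value, label in zip(fractions, labels):
        running_sum, count = totals.get(label, (0.0, 0))
        totals[label] = (running_sum + value, count + 1)
    return {label: s / c for label, (s, c) in totals.items()}


# Hypothetical toy data: four majority points (class 0) and two
# minority points (class 1), embedded on a line.
embeddings = [(0.0,), (1.0,), (2.0,), (3.0,), (2.4,), (10.0,)]
labels = [0, 0, 0, 0, 1, 1]

fractions = knn_same_class_fraction(embeddings, labels, k=2)
means = per_class_mean(fractions, labels)
print(means)
```

On this toy data the minority class has a lower mean same-class fraction than the majority class, which is the kind of local structural deficiency a global summary can hide.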

Related papers

CORAL: Disentangling Latent Representations in Long-Tailed Diffusion [4.310167974376405]
We investigate the behavior of diffusion models trained on long-tailed datasets. Latent representations for tail class subspaces exhibit significant overlap with those of head classes. We propose a contrastive latent alignment framework that leverages supervised contrastive losses to encourage well-separated latent class representations.
arXiv Detail & Related papers (2025-06-19T00:23:44Z)
Uncertainty-guided Boundary Learning for Imbalanced Social Event Detection [64.4350027428928]
We propose a novel uncertainty-guided class imbalance learning framework for imbalanced social event detection tasks. Our model significantly improves social event representation and classification tasks in almost all classes, especially those uncertain ones.
arXiv Detail & Related papers (2023-10-30T03:32:04Z)
Class-Imbalanced Graph Learning without Class Rebalancing [62.1368829847041]
Class imbalance is prevalent in real-world node classification tasks and poses great challenges for graph learning models. In this work, we approach the root cause of class-imbalance bias from a topological paradigm. We devise a lightweight topological augmentation framework BAT to mitigate the class-imbalance bias without class rebalancing.
arXiv Detail & Related papers (2023-08-27T19:01:29Z)
Neural Collapse Terminus: A Unified Solution for Class Incremental Learning and Its Variants [166.916517335816]
In this paper, we offer a unified solution to the misalignment dilemma in the three tasks. We propose neural collapse terminus that is a fixed structure with the maximal equiangular inter-class separation for the whole label space. Our method holds the neural collapse optimality in an incremental fashion regardless of data imbalance or data scarcity.
arXiv Detail & Related papers (2023-08-03T13:09:59Z)
Adjusting Logit in Gaussian Form for Long-Tailed Visual Recognition [37.62659619941791]
We study the problem of long-tailed visual recognition from the perspective of feature level. Two novel logit adjustment methods are proposed to improve model performance at a modest computational overhead. Experiments conducted on benchmark datasets demonstrate the superior performance of the proposed method over the state-of-the-art ones.
arXiv Detail & Related papers (2023-05-18T02:06:06Z)
Inducing Neural Collapse in Deep Long-tailed Learning [13.242721780822848]
We propose two explicit feature regularization terms to learn high-quality representation for class-imbalanced data. With the proposed regularization, Neural Collapse phenomena will appear under the class-imbalanced distribution. Our method is easily implemented, highly effective, and can be plugged into most existing methods.
arXiv Detail & Related papers (2023-02-24T05:07:05Z)
Constructing Balance from Imbalance for Long-tailed Image Recognition [50.6210415377178]
The imbalance between majority (head) classes and minority (tail) classes severely skews data-driven deep neural networks. Previous methods tackle data imbalance from the viewpoints of data distribution, feature space, and model design. We propose a concise paradigm by progressively adjusting label space and dividing the head classes and tail classes. Our proposed model also provides a feature evaluation method and paves the way for long-tailed feature learning.
arXiv Detail & Related papers (2022-08-04T10:22:24Z)
A Reduction to Binary Approach for Debiasing Multiclass Datasets [12.885756277367443]
We prove that R2B satisfies optimality and bias guarantees and demonstrate empirically that it can lead to an improvement over two baselines. We validate these conclusions on synthetic and real-world datasets from social science, computer vision, and healthcare.
arXiv Detail & Related papers (2022-05-31T15:11:41Z)
Neighborhood Contrastive Learning for Novel Class Discovery [79.14767688903028]
We build a new framework, named Neighborhood Contrastive Learning, to learn discriminative representations that are important to clustering performance. We experimentally demonstrate that these two ingredients significantly contribute to clustering performance and lead our model to outperform state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2021-06-20T17:34:55Z)
Semi-supervised Long-tailed Recognition using Alternate Sampling [95.93760490301395]
Main challenges in long-tailed recognition come from the imbalanced data distribution and sample scarcity in its tail classes. We propose a new recognition setting, namely semi-supervised long-tailed recognition. We demonstrate significant accuracy improvements over other competitive methods on two datasets.
arXiv Detail & Related papers (2021-05-01T00:43:38Z)
Analyzing Overfitting under Class Imbalance in Neural Networks for Image Segmentation [19.259574003403998]
In image segmentation, neural networks may overfit to the foreground samples from small structures. In this study, we provide new insights on the problem of overfitting under class imbalance by inspecting the network behavior.
arXiv Detail & Related papers (2021-02-20T14:57:58Z)
Long-Tailed Recognition Using Class-Balanced Experts [128.73438243408393]
We propose an ensemble of class-balanced experts that combines the strength of diverse classifiers. Our ensemble of class-balanced experts reaches results close to state-of-the-art and an extended ensemble establishes a new state-of-the-art on two benchmarks for long-tailed recognition.
arXiv Detail & Related papers (2020-04-07T20:57:44Z)
Imbalanced Data Learning by Minority Class Augmentation using Capsule Adversarial Networks [31.073558420480964]
We propose a method to restore the balance in imbalanced images, by coalescing two concurrent methods. In our model, generative and discriminative networks play a novel competitive game. The coalescing of capsule-GAN is effective at recognizing highly overlapping classes with much fewer parameters compared with the convolutional-GAN.
arXiv Detail & Related papers (2020-04-05T12:36:06Z)

This list is automatically generated from the titles and abstracts of the papers on this site.