CORAL: Disentangling Latent Representations in Long-Tailed Diffusion
- URL: http://arxiv.org/abs/2506.15933v1
- Date: Thu, 19 Jun 2025 00:23:44 GMT
- Title: CORAL: Disentangling Latent Representations in Long-Tailed Diffusion
- Authors: Esther Rodriguez, Monica Welfert, Samuel McDowell, Nathan Stromberg, Julian Antolin Camarena, Lalitha Sankar
- Abstract summary: We investigate the behavior of diffusion models trained on long-tailed datasets. Latent representations for tail class subspaces exhibit significant overlap with those of head classes. We propose a contrastive latent alignment framework that leverages supervised contrastive losses to encourage well-separated latent class representations.
- Score: 4.310167974376405
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Diffusion models have achieved impressive performance in generating high-quality and diverse synthetic data. However, their success typically assumes a class-balanced training distribution. In real-world settings, multi-class data often follow a long-tailed distribution, where standard diffusion models struggle -- producing low-diversity and lower-quality samples for tail classes. While this degradation is well-documented, its underlying cause remains poorly understood. In this work, we investigate the behavior of diffusion models trained on long-tailed datasets and identify a key issue: the latent representations (from the bottleneck layer of the U-Net) for tail class subspaces exhibit significant overlap with those of head classes, leading to feature borrowing and poor generation quality. Importantly, we show that this is not merely due to limited data per class, but that the relative class imbalance significantly contributes to this phenomenon. To address this, we propose COntrastive Regularization for Aligning Latents (CORAL), a contrastive latent alignment framework that leverages supervised contrastive losses to encourage well-separated latent class representations. Experiments demonstrate that CORAL significantly improves both the diversity and visual quality of samples generated for tail classes relative to state-of-the-art methods.
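The abstract says CORAL uses supervised contrastive losses to separate class latents but does not give the exact formulation. As a rough illustration only, the sketch below implements the standard supervised contrastive (SupCon) loss over a batch of latent vectors in plain Python. The function names, the temperature value, and the toy 2-D latents are assumptions made for this example; they are not taken from the paper, and how CORAL combines such a term with the denoising objective is not shown here.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def supcon_loss(latents, labels, temperature=0.1):
    """Standard supervised contrastive loss over one batch.

    latents: list of equal-length float lists (hypothetical stand-ins for
             flattened U-Net bottleneck features).
    labels:  list of class ids, one per latent.
    Anchors with no same-class partner in the batch are skipped.
    """
    z = [l2_normalize(v) for v in latents]
    n = len(z)
    total, num_anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue
        # Temperature-scaled cosine similarity to every other sample.
        sims = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(n)
            if j != i
        }
        # Numerically stable log-sum-exp over all non-anchor samples.
        max_sim = max(sims.values())
        log_denom = max_sim + math.log(
            sum(math.exp(s - max_sim) for s in sims.values())
        )
        # Average negative log-probability of the positives.
        total += -sum(sims[p] - log_denom for p in positives) / len(positives)
        num_anchors += 1
    return total / num_anchors if num_anchors else 0.0


# Toy batch: two classes whose latents are either well separated or overlapping.
labels = [0, 0, 1, 1]
separated = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
overlapping = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]]
print(supcon_loss(separated, labels), supcon_loss(overlapping, labels))
```

In this toy batch the overlapping arrangement yields the larger loss, so minimizing the term pushes same-class latents together and different-class latents apart, which is the behavior the abstract attributes to contrastive latent alignment.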
Related papers
- ViRN: Variational Inference and Distribution Trilateration for Long-Tailed Continual Representation Learning [6.253882111488726]
ViRN is a novel framework that integrates variational inference with distributional trilateration for robust long-tailed learning. It is evaluated on six long-tailed classification benchmarks, including speech (e.g., rare acoustic events, accents) and image tasks. It achieves a 10.24% average accuracy gain over state-of-the-art methods.
arXiv Detail & Related papers (2025-07-23T10:04:30Z)
- Training Class-Imbalanced Diffusion Model Via Overlap Optimization [55.96820607533968]
Diffusion models trained on real-world datasets often yield inferior fidelity for tail classes.
Deep generative models, including diffusion models, are biased towards classes with abundant training images.
We propose a method based on contrastive learning to minimize the overlap between distributions of synthetic images for different classes.
arXiv Detail & Related papers (2024-02-16T16:47:21Z)
- Fair GANs through model rebalancing for extremely imbalanced class distributions [5.463417677777276]
We present an approach to construct an unbiased generative adversarial network (GAN) from an existing biased GAN.
We show results for the StyleGAN2 models while training on the Flickr Faces High Quality (FFHQ) dataset for racial fairness.
We further validate our approach by applying it to an imbalanced CIFAR10 dataset which is also twice as large.
arXiv Detail & Related papers (2023-08-16T19:20:06Z)
- Concept Drift and Long-Tailed Distribution in Fine-Grained Visual Categorization: Benchmark and Method [84.68818879525568]
We present a Concept Drift and Long-Tailed Distribution (CDLT) dataset.
The characteristics of instances tend to vary with time and exhibit a long-tailed distribution.
We propose a feature recombination framework to address the learning challenges associated with CDLT.
arXiv Detail & Related papers (2023-06-04T12:42:45Z)
- Class-Balancing Diffusion Models [57.38599989220613]
Class-Balancing Diffusion Models (CBDM) are trained with a distribution adjustment regularizer as a solution.
We benchmark generation results on the CIFAR100/CIFAR100LT datasets and show outstanding performance on the downstream recognition task.
arXiv Detail & Related papers (2023-04-30T20:00:14Z)
- Improving GANs for Long-Tailed Data through Group Spectral Regularization [51.58250647277375]
We propose a novel group Spectral Regularizer (gSR) that prevents spectral explosion, alleviating mode collapse.
We find that gSR effectively combines with existing augmentation and regularization techniques, leading to state-of-the-art image generation performance on long-tailed data.
arXiv Detail & Related papers (2022-08-21T17:51:05Z)
- Targeted Supervised Contrastive Learning for Long-Tailed Recognition [50.24044608432207]
Real-world data often exhibit long-tailed distributions with heavy class imbalance.
We show that while supervised contrastive learning can help improve performance, past baselines suffer from poor uniformity brought in by imbalanced data distribution.
We propose targeted supervised contrastive learning (TSC), which improves the uniformity of the feature distribution on the hypersphere.
arXiv Detail & Related papers (2021-11-27T22:40:10Z)
- When Relation Networks meet GANs: Relation GANs with Triplet Loss [110.7572918636599]
Training stability is still a lingering concern of generative adversarial networks (GANs).
In this paper, we explore a relation network architecture for the discriminator and design a triplet loss that achieves better generalization and stability.
Experiments on benchmark datasets show that the proposed relation discriminator and new loss can provide significant improvement on various vision tasks.
arXiv Detail & Related papers (2020-02-24T11:35:28Z)
This list is automatically generated from the titles and abstracts of the papers on this site.