Reweighted Flow Matching via Unbalanced OT for Label-free Long-tailed Generation
- URL: http://arxiv.org/abs/2509.25713v1
- Date: Tue, 30 Sep 2025 03:19:52 GMT
- Title: Reweighted Flow Matching via Unbalanced OT for Label-free Long-tailed Generation
- Authors: Hyunsoo Song, Minjung Gim, Jaewoong Choi
- Abstract summary: We propose Unbalanced Optimal Transport Reweighted Flow Matching (UOT-RFM) for long-tailed distributions. Our method mitigates majority bias through a principled inverse reweighting strategy. Our model outperforms existing flow matching baselines on long-tailed benchmarks.
- Score: 9.488297561764211
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Flow matching has recently emerged as a powerful framework for continuous-time generative modeling. However, when applied to long-tailed distributions, standard flow matching suffers from majority bias, producing minority modes with low fidelity and failing to match the true class proportions. In this work, we propose Unbalanced Optimal Transport Reweighted Flow Matching (UOT-RFM), a novel framework for generative modeling under class-imbalanced (long-tailed) distributions that operates without any class label information. Our method constructs the conditional vector field using mini-batch Unbalanced Optimal Transport (UOT) and mitigates majority bias through a principled inverse reweighting strategy. The reweighting relies on a label-free majority score, defined as the density ratio between the target distribution and the UOT marginal. This score quantifies the degree of majority based on the geometric structure of the data, without requiring class labels. By incorporating this score into the training objective, UOT-RFM theoretically recovers the target distribution with first-order correction ($k=1$) and empirically improves tail-class generation through higher-order corrections ($k > 1$). Our model outperforms existing flow matching baselines on long-tailed benchmarks, while maintaining competitive performance on balanced datasets.
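The reweighting described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: it assumes an entropic, KL-relaxed mini-batch UOT solved by generalized Sinkhorn scaling, and it assumes the majority score is the UOT marginal on the target samples divided by the empirical target weight, so that the inverse weight is that score raised to the power -k. The function names (`uot_plan`, `inverse_weights`, `weighted_fm_loss`) and the parameters `eps`, `tau`, and `n_pairs` are hypothetical choices made for illustration only.

```python
import numpy as np

def uot_plan(x0, x1, eps=1.0, tau=1.0, n_iter=200):
    """Entropic UOT plan between two mini-batches (assumed solver, not the paper's)."""
    n, m = len(x0), len(x1)
    a = np.full(n, 1.0 / n)                      # uniform source weights
    b = np.full(m, 1.0 / m)                      # uniform target weights
    cost = ((x0[:, None, :] - x1[None, :, :]) ** 2).sum(-1)
    K = np.exp(-cost / eps)                      # Gibbs kernel
    u, v = np.ones(n), np.ones(m)
    p = tau / (tau + eps)                        # exponent from the KL marginal penalty
    for _ in range(n_iter):
        u = (a / (K @ v)) ** p
        v = (b / (K.T @ u)) ** p
    return u[:, None] * K * v[None, :], b

def inverse_weights(plan, b, k=1.0):
    """Per-target-sample weights: (assumed majority score) ** (-k), mean-normalized."""
    score = plan.sum(axis=0) / b                 # UOT marginal relative to target weight
    w = score ** (-k)
    return w / w.mean()

def weighted_fm_loss(v_fn, x0, x1, plan, w, rng, n_pairs=256):
    """Reweighted flow matching loss on pairs sampled from the UOT plan."""
    probs = (plan / plan.sum()).ravel()
    idx = rng.choice(plan.size, size=n_pairs, p=probs)
    i, j = np.unravel_index(idx, plan.shape)
    t = rng.uniform(size=(n_pairs, 1))
    xt = (1.0 - t) * x0[i] + t * x1[j]           # linear interpolation path
    target = x1[j] - x0[i]                       # conditional vector field
    err = ((v_fn(xt, t) - target) ** 2).sum(-1)
    return float((w[j] * err).mean())

# Toy long-tailed target: 56 majority points and 8 minority points.
rng = np.random.default_rng(0)
x0 = rng.normal(size=(64, 2))
x1 = np.concatenate([rng.normal(size=(56, 2)) * 0.3 + [1.0, 0.0],
                     rng.normal(size=(8, 2)) * 0.3 + [-3.0, 3.0]])
plan, b = uot_plan(x0, x1)
w = inverse_weights(plan, b, k=1.0)
loss = weighted_fm_loss(lambda x, t: np.zeros_like(x), x0, x1, plan, w, rng)
```

Under these assumptions, k=1 makes the reweighted target marginal of the plan proportional to the original target weights, which mirrors the first-order correction mentioned in the abstract. Values of k above 1 would push additional weight toward samples that the UOT plan under-covers.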
Related papers
- Rebalancing with Calibrated Sub-classes (RCS): A Statistical Fusion-based Framework for Robust Imbalanced Classification across Modalities [16.993547305381327]
Rebalancing with Calibrated Sub-classes (RCS) is a novel distribution calibration framework for robust imbalanced classification. RCS fuses statistical information from the majority and intermediate class distributions via a weighted mixture of Gaussian components.
arXiv Detail & Related papers (2025-10-10T00:06:13Z)
- Self-Boost via Optimal Retraining: An Analysis via Approximate Message Passing [58.52119063742121]
Retraining a model using its own predictions together with the original, potentially noisy labels is a well-known strategy for improving the model performance. This paper addresses the question of how to optimally combine the model's predictions and the provided labels. Our main contribution is the derivation of the Bayes optimal aggregator function to combine the current model's predictions and the given labels.
arXiv Detail & Related papers (2025-05-21T07:16:44Z)
- FedGA: Federated Learning with Gradient Alignment for Error Asymmetry Mitigation [5.3663750040721085]
Federated learning (FL) triggers intra-client and inter-client class imbalance. We propose a gradient alignment (GA)-informed FL method, dubbed as FedGA.
arXiv Detail & Related papers (2024-12-21T11:15:20Z)
- Marginal Debiased Network for Fair Visual Recognition [59.05212866862219]
We propose a novel marginal debiased network (MDN) to learn debiased representations.
Our MDN can achieve a remarkable performance on under-represented samples.
arXiv Detail & Related papers (2024-01-04T08:57:09Z)
- DR-Tune: Improving Fine-tuning of Pretrained Visual Models by Distribution Regularization with Semantic Calibration [38.4461170690033]
We propose a novel fine-tuning framework, namely distribution regularization with semantic calibration (DR-Tune).
DR-Tune employs distribution regularization by enforcing the downstream task head to decrease its classification error on the pretrained feature distribution.
To alleviate the interference by semantic drift, we develop the semantic calibration (SC) module.
arXiv Detail & Related papers (2023-08-23T10:59:20Z)
- Fair GANs through model rebalancing for extremely imbalanced class distributions [5.463417677777276]
We present an approach to construct an unbiased generative adversarial network (GAN) from an existing biased GAN.
We show results for the StyleGAN2 models while training on the Flickr Faces High Quality (FFHQ) dataset for racial fairness.
We further validate our approach by applying it to an imbalanced CIFAR10 dataset which is also twice as large.
arXiv Detail & Related papers (2023-08-16T19:20:06Z)
- Learning to Re-weight Examples with Optimal Transport for Imbalanced Classification [74.62203971625173]
Imbalanced data pose challenges for deep learning based classification models.
One of the most widely-used approaches for tackling imbalanced data is re-weighting.
We propose a novel re-weighting method based on optimal transport (OT) from a distributional point of view.
arXiv Detail & Related papers (2022-08-05T01:23:54Z)
- Distributionally Robust Models with Parametric Likelihood Ratios [123.05074253513935]
Three simple ideas allow us to train models with DRO using a broader class of parametric likelihood ratios.
We find that models trained with the resulting parametric adversaries are consistently more robust to subpopulation shifts when compared to other DRO approaches.
arXiv Detail & Related papers (2022-04-13T12:43:12Z)
- Relieving Long-tailed Instance Segmentation via Pairwise Class Balance [85.53585498649252]
Long-tailed instance segmentation is a challenging task due to the extreme imbalance of training samples among classes.
It causes severe biases of the head classes (with majority samples) against the tailed ones.
We propose a novel Pairwise Class Balance (PCB) method, built upon a confusion matrix which is updated during training to accumulate the ongoing prediction preferences.
arXiv Detail & Related papers (2022-01-08T07:48:36Z)
- PLM: Partial Label Masking for Imbalanced Multi-label Classification [59.68444804243782]
Neural networks trained on real-world datasets with long-tailed label distributions are biased towards frequent classes and perform poorly on infrequent classes.
We propose a method, Partial Label Masking (PLM), which utilizes this ratio during training.
Our method achieves strong performance when compared to existing methods on both multi-label (MultiMNIST and MSCOCO) and single-label (imbalanced CIFAR-10 and CIFAR-100) image classification datasets.
arXiv Detail & Related papers (2021-05-22T18:07:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.