Learning Majority-to-Minority Transformations with MMD and Triplet Loss for Imbalanced Classification
- URL: http://arxiv.org/abs/2509.11511v1
- Date: Mon, 15 Sep 2025 01:47:29 GMT
- Title: Learning Majority-to-Minority Transformations with MMD and Triplet Loss for Imbalanced Classification
- Authors: Suman Cha, Hyunjoong Kim
- Abstract summary: Class imbalance in supervised classification often degrades model performance by biasing predictions toward the majority class. We introduce an oversampling framework that learns a parametric transformation to map majority samples into the minority distribution. Our approach minimizes the maximum mean discrepancy (MMD) between transformed and true minority samples for global alignment.
- Score: 0.5390869741300152
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Class imbalance in supervised classification often degrades model performance by biasing predictions toward the majority class, particularly in critical applications such as medical diagnosis and fraud detection. Traditional oversampling techniques, including SMOTE and its variants, generate synthetic minority samples via local interpolation but fail to capture global data distributions in high-dimensional spaces. Deep generative models based on GANs offer richer distribution modeling yet suffer from training instability and mode collapse under severe imbalance. To overcome these limitations, we introduce an oversampling framework that learns a parametric transformation to map majority samples into the minority distribution. Our approach minimizes the maximum mean discrepancy (MMD) between transformed and true minority samples for global alignment, and incorporates a triplet loss regularizer to enforce boundary awareness by guiding synthesized samples toward challenging borderline regions. We evaluate our method on 29 synthetic and real-world datasets, demonstrating consistent improvements over classical and generative baselines in AUROC, G-mean, F1-score, and MCC. These results confirm the robustness, computational efficiency, and practical utility of the proposed framework for imbalanced classification tasks.
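The two loss terms named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the Gaussian kernel, the bandwidth, the margin, the weight `lam`, the moment-matching `transform` (a stand-in for a learned parametric map), and the assignment of anchor, positive, and negative roles in the triplet term are all assumptions made for illustration.

```python
import numpy as np


def rbf_kernel(a, b, bandwidth=1.0):
    """Gaussian kernel matrix: k(a_i, b_j) = exp(-||a_i - b_j||^2 / (2 * bandwidth^2))."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point cancellation.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def mmd2(x, y, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between samples x and y."""
    return (
        rbf_kernel(x, x, bandwidth).mean()
        + rbf_kernel(y, y, bandwidth).mean()
        - 2.0 * rbf_kernel(x, y, bandwidth).mean()
    )


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Mean hinge triplet loss: max(0, d(a, p) - d(a, n) + margin)."""
    d_pos = np.linalg.norm(anchor - positive, axis=1)
    d_neg = np.linalg.norm(anchor - negative, axis=1)
    return np.maximum(0.0, d_pos - d_neg + margin).mean()


# Toy data (assumption): two Gaussian blobs in 2-D with a 5:1 imbalance.
rng = np.random.default_rng(0)
majority = rng.normal(loc=0.0, scale=1.0, size=(200, 2))
minority = rng.normal(loc=3.0, scale=0.5, size=(40, 2))


def transform(x):
    """Hypothetical stand-in for the learned map: match minority mean and std."""
    standardized = (x - majority.mean(axis=0)) / majority.std(axis=0)
    return standardized * minority.std(axis=0) + minority.mean(axis=0)


synthetic = transform(majority)

mmd_before = mmd2(majority, minority)
mmd_after = mmd2(synthetic, minority)

# Assumed triplet roles: synthetic sample as anchor, a real minority sample as
# positive, and the originating majority sample as negative.
n = len(minority)
regularizer = triplet_loss(synthetic[:n], minority, majority[:n])

lam = 0.1  # hypothetical weight on the triplet term
total_loss = mmd_after + lam * regularizer
print(f"MMD^2 before: {mmd_before:.4f}, after: {mmd_after:.4f}, total: {total_loss:.4f}")
```

In a trainable version, `transform` would be a parametric model whose parameters are updated to reduce `total_loss`; the fixed affine map here only shows that aligning the transformed majority samples with the minority distribution lowers the MMD term.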
Related papers
- Generate Aligned Anomaly: Region-Guided Few-Shot Anomaly Image-Mask Pair Synthesis for Industrial Inspection [53.137651284042434]
Anomaly inspection plays a vital role in industrial manufacturing, but the scarcity of anomaly samples limits the effectiveness of existing methods. We propose Generate Aligned Anomaly (GAA), a region-guided, few-shot anomaly image-mask pair generation framework. GAA generates realistic, diverse, and semantically aligned anomalies using only a small number of samples.
arXiv Detail & Related papers (2025-07-13T12:56:59Z)
- SMOGAN: Synthetic Minority Oversampling with GAN Refinement for Imbalanced Regression [0.0]
Imbalanced regression refers to prediction tasks where the target variable is skewed. This skewness hinders machine learning models, especially neural networks, which concentrate on dense regions. We propose SMOGAN, a two-step oversampling framework for imbalanced regression.
arXiv Detail & Related papers (2025-04-29T20:15:25Z)
- Addressing Class Imbalance with Probabilistic Graphical Models and Variational Inference [10.457756074328664]
This study proposes a method for imbalanced data classification based on deep probabilistic graphical models (DPGMs). We introduce variational inference optimization probability modeling, which enables the model to adaptively adjust the representation ability of minority classes. We combine this with an adversarial learning mechanism that generates minority class samples in the latent space so that the model can better characterize the category boundary.
arXiv Detail & Related papers (2025-04-08T07:38:30Z)
- Unbiased Max-Min Embedding Classification for Transductive Few-Shot Learning: Clustering and Classification Are All You Need [83.10178754323955]
Few-shot learning enables models to generalize from only a few labeled examples. We propose the Unbiased Max-Min Embedding Classification (UMMEC) Method, which addresses the key challenges in few-shot learning. Our method significantly improves classification performance with minimal labeled data, advancing the state of the art in few-shot learning.
arXiv Detail & Related papers (2025-03-28T07:23:07Z)
- A Unified Generalization Analysis of Re-Weighting and Logit-Adjustment for Imbalanced Learning [129.63326990812234]
We propose a technique named data-dependent contraction to capture how modified losses handle different classes.
On top of this technique, a fine-grained generalization bound is established for imbalanced learning, which helps reveal the mystery of re-weighting and logit-adjustment.
arXiv Detail & Related papers (2023-10-07T09:15:08Z)
- Tackling Diverse Minorities in Imbalanced Classification [80.78227787608714]
Imbalanced datasets are commonly observed in various real-world applications, presenting significant challenges in training classifiers.
We propose generating synthetic samples iteratively by mixing data samples from both minority and majority classes.
We demonstrate the effectiveness of our proposed framework through extensive experiments conducted on seven publicly available benchmark datasets.
arXiv Detail & Related papers (2023-08-28T18:48:34Z)
- Variational Classification [51.2541371924591]
We derive a variational objective to train the model, analogous to the evidence lower bound (ELBO) used to train variational auto-encoders.
Treating inputs to the softmax layer as samples of a latent variable, our abstracted perspective reveals a potential inconsistency.
We induce a chosen latent distribution, instead of the implicit assumption found in a standard softmax layer.
arXiv Detail & Related papers (2023-05-17T17:47:19Z)
- Compound Batch Normalization for Long-tailed Image Classification [77.42829178064807]
We propose a compound batch normalization method based on a Gaussian mixture.
It can model the feature space more comprehensively and reduce the dominance of head classes.
The proposed method outperforms existing methods on long-tailed image classification.
arXiv Detail & Related papers (2022-12-02T07:31:39Z)
- Imbalanced Classification via a Tabular Translation GAN [4.864819846886142]
We present a model based on Generative Adversarial Networks which uses additional regularization losses to map majority samples to corresponding synthetic minority samples.
We show that the proposed method improves average precision when compared to alternative re-weighting and oversampling techniques.
arXiv Detail & Related papers (2022-04-19T06:02:53Z)
- Distributionally Robust Models with Parametric Likelihood Ratios [123.05074253513935]
Three simple ideas allow us to train models with DRO using a broader class of parametric likelihood ratios.
We find that models trained with the resulting parametric adversaries are consistently more robust to subpopulation shifts when compared to other DRO approaches.
arXiv Detail & Related papers (2022-04-13T12:43:12Z)
- A Novel Adaptive Minority Oversampling Technique for Improved Classification in Data Imbalanced Scenarios [23.257891827728827]
Imbalance in the proportion of training samples belonging to different classes often degrades the performance of conventional classifiers.
We propose a novel three-step technique to address imbalanced data.
arXiv Detail & Related papers (2021-03-24T09:58:02Z)
- Conditional Wasserstein GAN-based Oversampling of Tabular Data for Imbalanced Learning [10.051309746913512]
We propose an oversampling method based on a conditional Wasserstein GAN.
We benchmark our method against standard oversampling methods and the imbalanced baseline on seven real-world datasets.
arXiv Detail & Related papers (2020-08-20T20:33:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.