Reducing Class-Wise Performance Disparity via Margin Regularization
- URL: http://arxiv.org/abs/2602.00205v1
- Date: Fri, 30 Jan 2026 12:56:08 GMT
- Title: Reducing Class-Wise Performance Disparity via Margin Regularization
- Authors: Beier Zhu, Kesen Zhao, Jiequan Cui, Qianru Sun, Yuan Zhou, Xun Yang, Hanwang Zhang
- Abstract summary: Deep neural networks often exhibit substantial disparities in class-wise accuracy, even when trained on class-balanced data. We present Margin Regularization for Performance Disparity Reduction (MR$^2$), a theoretically principled regularization for classification. Our analysis reveals how per-class feature variability contributes to error, motivating the use of larger margins for hard classes.
- Score: 82.81746960548382
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep neural networks often exhibit substantial disparities in class-wise accuracy, even when trained on class-balanced data, posing concerns for reliable deployment. While prior efforts have explored empirical remedies, a theoretical understanding of such performance disparities in classification remains limited. In this work, we present Margin Regularization for Performance Disparity Reduction (MR$^2$), a theoretically principled regularization for classification that dynamically adjusts margins in both the logit and representation spaces. Our analysis establishes a margin-based, class-sensitive generalization bound that reveals how per-class feature variability contributes to error, motivating the use of larger margins for hard classes. Guided by this insight, MR$^2$ optimizes per-class logit margins proportional to feature spread and penalizes excessive representation margins to enhance intra-class compactness. Experiments on seven datasets, including ImageNet, and diverse pre-trained backbones (MAE, MoCov2, CLIP) demonstrate that MR$^2$ not only improves overall accuracy but also significantly boosts hard-class performance without trading off easy classes, thus reducing performance disparity. Code is available at: https://github.com/BeierZhu/MR2
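The abstract names two concrete ingredients: per-class logit margins scaled by feature spread, and a penalty that keeps each class's representations compact. Below is a minimal PyTorch sketch of that idea, for illustration only: the running-statistics estimator, the margin form, and the hyperparameters `alpha` and `lambda_rep` are assumptions, not the authors' exact formulation (see the linked repository for that).

```python
import torch
import torch.nn.functional as F


class MarginRegularizedLoss(torch.nn.Module):
    """Illustrative sketch: spread-proportional logit margins plus a
    representation-compactness penalty (not the official MR^2 code)."""

    def __init__(self, num_classes, feat_dim, alpha=0.5, lambda_rep=0.1, momentum=0.9):
        super().__init__()
        self.alpha = alpha            # margin scale per unit of feature spread (assumed)
        self.lambda_rep = lambda_rep  # weight of the compactness penalty (assumed)
        self.momentum = momentum
        # Running per-class feature means and spreads: one simple way to
        # estimate per-class feature variability online.
        self.register_buffer("class_mean", torch.zeros(num_classes, feat_dim))
        self.register_buffer("class_spread", torch.ones(num_classes))

    @torch.no_grad()
    def _update_stats(self, features, labels):
        for c in labels.unique():
            feats_c = features[labels == c]
            self.class_mean[c] = (self.momentum * self.class_mean[c]
                                  + (1 - self.momentum) * feats_c.mean(dim=0))
            if feats_c.size(0) > 1:  # std of a single sample is undefined
                spread_c = feats_c.std(dim=0).mean()
                self.class_spread[c] = (self.momentum * self.class_spread[c]
                                        + (1 - self.momentum) * spread_c)

    def forward(self, logits, features, labels):
        self._update_stats(features.detach(), labels)
        # Subtract a larger margin from the target logit of high-spread
        # ("hard") classes, so they must be classified with more slack.
        margins = self.alpha * self.class_spread[labels]
        adjusted = logits.clone()
        adjusted[torch.arange(labels.size(0)), labels] -= margins
        ce = F.cross_entropy(adjusted, labels)
        # Penalize distance to the running class mean to tighten each class.
        rep_penalty = (features - self.class_mean[labels]).pow(2).sum(dim=1).mean()
        return ce + self.lambda_rep * rep_penalty
```

In a training loop this would be called as `loss = criterion(logits, features, labels)`, with `features` being the backbone embeddings that feed the classifier head.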
Related papers
- Sculpting Margin Penalty: Intra-Task Adapter Merging and Classifier Calibration for Few-Shot Class-Incremental Learning [20.574528816984955]
Forward-compatible learning is a promising solution for Few-Shot Class-Incremental Learning (FSCIL). We propose SMP (Sculpting Margin Penalty), a novel FSCIL method that strategically integrates margin penalties at different stages within the parameter-efficient fine-tuning paradigm. We show that SMP achieves state-of-the-art performance in FSCIL while maintaining a better balance between base and new classes.
arXiv Detail & Related papers (2025-08-07T07:26:24Z)
- Understanding the Detrimental Class-level Effects of Data Augmentation [63.1733767714073]
Achieving optimal average accuracy with data augmentation (DA) comes at the cost of significantly hurting individual class accuracy, by as much as 20% on ImageNet.
We present a framework for understanding how DA interacts with class-level learning dynamics.
We show that simple class-conditional augmentation strategies improve performance on the negatively affected classes.
arXiv Detail & Related papers (2023-12-07T18:37:43Z)
- Regularized Linear Regression for Binary Classification [20.710343135282116]
Regularized linear regression is a promising approach for binary classification problems in which the training set has noisy labels.
We show that for large enough regularization strength, the optimal weights concentrate around two values of opposite sign.
We observe that in many cases the corresponding "compression" of each weight to a single bit leads to very little loss in performance (a toy numeric sketch of this effect appears after this list).
arXiv Detail & Related papers (2023-11-03T23:18:21Z)
- Towards Calibrated Hyper-Sphere Representation via Distribution Overlap Coefficient for Long-tailed Learning [8.208237033120492]
Long-tailed learning aims to tackle the challenge that head classes dominate the training procedure under severe class imbalance in real-world scenarios.
Motivated by this, we generalize cosine-based classifiers to a von Mises-Fisher (vMF) mixture model.
We measure representation quality on the hyper-sphere by calculating the distribution overlap coefficient.
arXiv Detail & Related papers (2022-08-22T03:53:29Z)
- Learning Towards the Largest Margins [83.7763875464011]
The loss function should promote the largest possible margins for both classes and samples.
Not only does this principled framework offer new perspectives to understand and interpret existing margin-based losses, but it can also guide the design of new tools.
arXiv Detail & Related papers (2022-06-23T10:03:03Z)
- Distribution of Classification Margins: Are All Data Equal? [61.16681488656473]
We motivate theoretically and show empirically that the area under the curve of the margin distribution on the training set is in fact a good measure of generalization.
The resulting subset of "high capacity" features is not consistent across different training runs.
arXiv Detail & Related papers (2021-07-21T16:41:57Z)
- Boosting Few-Shot Learning With Adaptive Margin Loss [109.03665126222619]
This paper proposes an adaptive margin principle to improve the generalization ability of metric-based meta-learning approaches for few-shot learning problems.
Extensive experiments demonstrate that the proposed method can boost the performance of current metric-based meta-learning approaches.
arXiv Detail & Related papers (2020-05-28T07:58:41Z)
- Negative Margin Matters: Understanding Margin in Few-shot Classification [72.85978953262004]
This paper introduces a negative margin loss to metric learning based few-shot learning methods.
The negative margin loss significantly outperforms the regular softmax loss and achieves state-of-the-art accuracy on three standard few-shot classification benchmarks (the general margin-softmax form is sketched after this list).
arXiv Detail & Related papers (2020-03-26T17:59:05Z)
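As noted in the Regularized Linear Regression entry above, here is a toy numeric sketch (not from that paper) of the one-bit-per-weight observation: fit ridge regression on noisy ±1 labels, replace each weight by its sign times a shared magnitude, and compare test accuracy. The data model and the regularization strength `lam` are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, lam = 500, 50, 10.0                                # samples, features, ridge strength (assumed)
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y = np.sign(X @ w_true + 0.5 * rng.standard_normal(n))   # noisy +/-1 labels

# Ridge solution: w = (X^T X + lam * I)^{-1} X^T y
w = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# "Compress" each weight to a single bit: keep its sign, share one magnitude.
w_bit = np.sign(w) * np.abs(w).mean()

X_test = rng.standard_normal((2000, d))
y_test = np.sign(X_test @ w_true)
acc = lambda wv: np.mean(np.sign(X_test @ wv) == y_test)
print(f"ridge acc: {acc(w):.3f}   one-bit acc: {acc(w_bit):.3f}")
```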
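For the Negative Margin Matters entry, the family of margin softmax losses it builds on has the standard cosine form below; the negative-margin variant simply takes $m < 0$. The scale $s$, margin $m$, and angle notation $\theta_j$ are the generic symbols of this literature, not necessarily that paper's exact notation.

$$\mathcal{L} = -\log \frac{e^{\,s(\cos\theta_y - m)}}{e^{\,s(\cos\theta_y - m)} + \sum_{j \neq y} e^{\,s\cos\theta_j}}$$

Here $\theta_j$ is the angle between a sample's feature and the class-$j$ prototype; a positive $m$ enlarges inter-class margins, while the entry above reports that taking $m < 0$ outperforms the regular softmax loss in few-shot classification.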
This list is automatically generated from the titles and abstracts of the papers in this site.