Balanced Sharpness-Aware Minimization for Imbalanced Regression
- URL: http://arxiv.org/abs/2508.16973v1
- Date: Sat, 23 Aug 2025 09:57:07 GMT
- Title: Balanced Sharpness-Aware Minimization for Imbalanced Regression
- Authors: Yahao Liu, Qin Wang, Lixin Duan, Wen Li
- Abstract summary: Real-world data often exhibits imbalanced distribution, making regression models perform poorly especially for target values with rare observations. We propose Balanced Sharpness-Aware Minimization (BSAM) to enforce the uniform generalization ability of regression models. In particular, we start from the traditional sharpness-aware minimization and then introduce a novel targeted reweighting strategy to homogenize the generalization ability across the observation space.
- Score: 29.03225426559032
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Regression is fundamental in computer vision and is widely used in various tasks including age estimation, depth estimation, target localization, etc. However, real-world data often exhibits imbalanced distribution, making regression models perform poorly, especially for target values with rare observations (known as the imbalanced regression problem). In this paper, we reframe imbalanced regression as an imbalanced generalization problem. To tackle that, we look into the loss sharpness property for measuring the generalization ability of regression models in the observation space. Namely, given a certain perturbation on the model parameters, we check how model performance changes according to the loss values of different target observations. We propose a simple yet effective approach called Balanced Sharpness-Aware Minimization (BSAM) to enforce the uniform generalization ability of regression models for the entire observation space. In particular, we start from the traditional sharpness-aware minimization and then introduce a novel targeted reweighting strategy to homogenize the generalization ability across the observation space, which guarantees a theoretical generalization bound. Extensive experiments on multiple vision regression tasks, including age and depth estimation, demonstrate that our BSAM method consistently outperforms existing approaches. The code is available at https://github.com/manmanjun/BSAM_for_Imbalanced_Regression.
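As a rough illustration of the approach described above, here is a minimal PyTorch sketch of a SAM-style two-step update with per-sample reweighting by target rarity. The `weight_fn` hook, the `rho` value, and the MSE objective are illustrative assumptions of this sketch, not the paper's exact formulation; see the linked repository for the authors' implementation.

```python
import torch

def bsam_step(model, optimizer, x, y, weight_fn, rho=0.05):
    """One SAM-style two-step update with per-target loss reweighting.

    `weight_fn(y)` should return a per-sample weight tensor (e.g. inverse
    frequency of the target's bin); it stands in for the paper's targeted
    reweighting strategy and is an assumption of this sketch.
    """
    mse = torch.nn.MSELoss(reduction="none")

    # Step 1: weighted loss and gradient at the current parameters.
    loss = (weight_fn(y) * mse(model(x).squeeze(-1), y)).mean()
    loss.backward()

    # Ascend to the (approximate) worst point in an L2 ball of radius rho.
    perturbations = []
    with torch.no_grad():
        grads = [p.grad for p in model.parameters() if p.grad is not None]
        grad_norm = torch.norm(torch.stack([g.norm(2) for g in grads]), 2)
        for p in model.parameters():
            if p.grad is None:
                continue
            e = rho * p.grad / (grad_norm + 1e-12)
            p.add_(e)
            perturbations.append((p, e))
    optimizer.zero_grad()

    # Step 2: gradient of the weighted loss at the perturbed parameters.
    ((weight_fn(y) * mse(model(x).squeeze(-1), y)).mean()).backward()

    # Restore the original parameters, then apply the sharpness-aware update.
    with torch.no_grad():
        for p, e in perturbations:
            p.sub_(e)
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()
```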
Related papers
- CAIRO: Decoupling Order from Scale in Regression [13.755937210012883]
We propose a framework that decouples regression into two distinct stages. In the first stage, we learn a scoring function by minimizing a scale-invariant ranking loss. In the second, we recover the target scale via isotonic regression.
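A minimal sketch of the second stage, assuming stage one has already produced scores that preserve the target ordering (here faked with a monotone transform of the targets); scikit-learn's `IsotonicRegression` then recovers the target scale.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

rng = np.random.default_rng(0)
y = rng.uniform(0, 100, size=200)                     # true targets

# Pretend stage one produced order-preserving scores on the wrong scale.
scores = np.tanh(y / 50.0) + rng.normal(0, 0.005, size=y.shape)

# Stage two: isotonic regression maps scores onto the target scale while
# preserving the learned ordering.
iso = IsotonicRegression(out_of_bounds="clip").fit(scores, y)
y_hat = iso.predict(scores)
print(f"MAE after rescaling: {np.abs(y_hat - y).mean():.3f}")
```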
arXiv Detail & Related papers (2026-02-16T03:50:05Z) - Geometric Properties of Neural Multivariate Regression [3.259067345005505]
Collapsed models exhibit ID_H < ID_Y, leading to over-compression and poor generalization. We identify two regimes that determine when expanding or reducing feature dimensionality improves performance.
arXiv Detail & Related papers (2025-10-01T16:50:57Z) - Resampling strategies for imbalanced regression: a survey and empirical analysis [5.863538874435322]
Imbalanced problems can arise in different real-world situations, and to address this, certain strategies in the form of resampling or balancing algorithms are proposed. This work presents an experimental study comprising various balancing and predictive models, using metrics that capture elements important to the user. It also proposes a taxonomy for imbalanced regression approaches based on three crucial criteria: regression model, learning process, and evaluation metrics.
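As a generic illustration of the resampling idea (not any specific algorithm from the survey), here is a sketch that oversamples rare target bins until each matches the head bin:

```python
import numpy as np

def oversample_rare_targets(X, y, n_bins=10, seed=0):
    """Naively rebalance a regression set by oversampling rare target bins.

    A generic illustration only; surveyed methods such as SMOTER also
    synthesize interpolated samples rather than duplicating existing ones.
    """
    rng = np.random.default_rng(seed)
    edges = np.histogram_bin_edges(y, bins=n_bins)
    bins = np.digitize(y, edges[1:-1])          # bin index per sample
    max_count = np.bincount(bins).max()
    idx = []
    for b in np.unique(bins):
        members = np.flatnonzero(bins == b)
        # Sample with replacement until every bin reaches the head-bin count.
        idx.append(rng.choice(members, size=max_count, replace=True))
    idx = np.concatenate(idx)
    return X[idx], y[idx]
```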
arXiv Detail & Related papers (2025-07-16T04:34:42Z) - Imbalance in Regression Datasets [0.9374652839580183]
We argue that imbalance in regression is an equally important problem which has so far been overlooked.
Due to under- and over-representation in a dataset's target distribution, regressors are prone to degenerating into naive models.
arXiv Detail & Related papers (2024-02-19T09:06:26Z) - Utilizing Multiple Inputs Autoregressive Models for Bearing Remaining Useful Life Prediction [3.448070371030467]
We introduce a novel multi-input autoregressive model to address this challenge in RUL prediction for bearings.
Through autoregressive iterations, the model attains a global receptive field, effectively overcoming the limitations in generalization.
Empirical evaluation on the PHM2012 dataset demonstrates that our model, compared to other backbone networks using similar autoregressive approaches, achieves significantly lower Root Mean Square Error (RMSE) and Score.
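A hedged sketch of the autoregressive-rollout idea: a fixed-window one-step predictor is iterated on its own outputs, so its effective history grows without bound. The toy linear `model` is purely illustrative; the paper's multi-input network is not reproduced here.

```python
import numpy as np

def autoregressive_rollout(model, window, horizon):
    """Roll a one-step predictor forward by feeding predictions back in.

    `model(window)` is assumed to map the most recent values to the next
    one; repeated iterations let a fixed-size model condition on an
    effectively unbounded history (the "global receptive field" idea).
    """
    window = list(window)
    preds = []
    for _ in range(horizon):
        nxt = model(np.asarray(window))
        preds.append(nxt)
        window = window[1:] + [nxt]   # slide the window over the prediction
    return np.asarray(preds)

# Toy usage: a linear "model" on a sliding window of length 4.
model = lambda w: float(w @ np.array([0.1, 0.2, 0.3, 0.4]))
print(autoregressive_rollout(model, [1.0, 2.0, 3.0, 4.0], horizon=3))
```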
arXiv Detail & Related papers (2023-11-26T09:50:32Z) - Deep Imbalanced Regression via Hierarchical Classification Adjustment [50.19438850112964]
Regression tasks in computer vision are often reformulated as classification by quantizing the target space into classes.
The majority of training samples lie in a head range of target values, while a minority of samples span a usually larger tail range.
We propose to construct hierarchical classifiers for solving imbalanced regression tasks.
Our novel hierarchical classification adjustment (HCA) for imbalanced regression shows superior results on three diverse tasks.
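A minimal sketch of the underlying regression-as-classification formulation: quantize targets into bins, train any classifier on the bin labels, and decode predicted probabilities back to a continuous value. The hierarchical coarse-to-fine adjustment that is the paper's contribution is omitted here.

```python
import numpy as np

def make_bins(y_train, n_bins=20):
    """Quantile-quantize the continuous target range into class bins."""
    edges = np.quantile(y_train, np.linspace(0, 1, n_bins + 1))
    centers = 0.5 * (edges[:-1] + edges[1:])
    return edges, centers

def encode(y, edges):
    """Map each continuous target to its bin index (a class label)."""
    return np.clip(np.searchsorted(edges, y, side="right") - 1, 0, len(edges) - 2)

def decode(probs, centers):
    """Turn predicted class probabilities back into a continuous estimate."""
    return probs @ centers

rng = np.random.default_rng(0)
y_train = rng.gamma(2.0, 10.0, size=1000)        # long-tailed targets
edges, centers = make_bins(y_train)
labels = encode(y_train, edges)                  # train any classifier on these
uniform = np.full(len(centers), 1.0 / len(centers))
print(decode(uniform, centers))                  # expectation decoding
```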
arXiv Detail & Related papers (2023-10-26T04:54:39Z) - Consensus-Adaptive RANSAC [104.87576373187426]
We propose a new RANSAC framework that learns to explore the parameter space by considering the residuals seen so far via a novel attention layer.
The attention mechanism operates on a batch of point-to-model residuals, and updates a per-point estimation state to take into account the consensus found through a lightweight one-step transformer.
arXiv Detail & Related papers (2023-07-26T08:25:46Z) - Theoretical Characterization of the Generalization Performance of Overfitted Meta-Learning [70.52689048213398]
This paper studies the performance of overfitted meta-learning under a linear regression model with Gaussian features.
We find new and interesting properties that do not exist in single-task linear regression.
Our analysis suggests that benign overfitting is more significant and easier to observe when the noise and the diversity/fluctuation of the ground truth of each training task are large.
arXiv Detail & Related papers (2023-04-09T20:36:13Z) - ResMem: Learn what you can and memorize the rest [79.19649788662511]
We propose the residual-memorization (ResMem) algorithm to augment an existing prediction model.
By construction, ResMem can explicitly memorize the training labels.
We show that ResMem consistently improves the test set generalization of the original prediction model.
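A simplified sketch of the residual-memorization idea, using a plain k-NN on raw inputs as the memory; the paper operates on learned features with a soft-label k-NN, so treat this as a stand-in.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 8))
y = X @ rng.normal(size=8) + 0.5 * np.sin(3.0 * X[:, 0])  # part Ridge misses

# Base predictor plus a k-NN "memory" fit on its training residuals.
base = Ridge().fit(X, y)
memory = KNeighborsRegressor(n_neighbors=5).fit(X, y - base.predict(X))

def resmem_predict(X_new):
    # Final prediction = base model output + retrieved residual correction.
    return base.predict(X_new) + memory.predict(X_new)

print(np.abs(resmem_predict(X) - y).mean())  # training error shrinks
```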
arXiv Detail & Related papers (2023-02-03T07:12:55Z) - Domain-Adjusted Regression or: ERM May Already Learn Features Sufficient for Out-of-Distribution Generalization [52.7137956951533]
We argue that devising simpler methods for learning predictors on existing features is a promising direction for future research.
We introduce Domain-Adjusted Regression (DARE), a convex objective for learning a linear predictor that is provably robust under a new model of distribution shift.
Under a natural model, we prove that the DARE solution is the minimax-optimal predictor for a constrained set of test distributions.
arXiv Detail & Related papers (2022-02-14T16:42:16Z) - Variation-Incentive Loss Re-weighting for Regression Analysis on Biased Data [8.115323786541078]
We aim to improve the accuracy of the regression analysis by addressing the data skewness/bias during model training.
We propose a Variation-Incentive Loss re-weighting method (VILoss) to optimize the gradient descent-based model training for regression analysis.
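A generic skeleton of target-dependent loss reweighting in PyTorch; the inverse-bin-frequency weights below are an illustrative stand-in for VILoss's variation-based weights, which are paper-specific.

```python
import torch

def reweighted_mse(pred, target, weights):
    """Weighted MSE: samples in under-represented target regions count more."""
    return (weights * (pred - target) ** 2).mean()

def inverse_frequency_weights(target, n_bins=10):
    # Illustrative weight: inverse frequency of the target's histogram bin.
    lo, hi = float(target.min()), float(target.max())
    hist = torch.histc(target, bins=n_bins, min=lo, max=hi)
    edges = torch.linspace(lo, hi, n_bins + 1)
    idx = torch.clamp(torch.bucketize(target, edges) - 1, 0, n_bins - 1)
    w = 1.0 / (hist[idx] + 1.0)
    return w / w.mean()   # normalize so the loss scale stays comparable

# Usage: loss = reweighted_mse(model(x).squeeze(-1), y,
#                              inverse_frequency_weights(y))
```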
arXiv Detail & Related papers (2021-09-14T10:22:21Z) - Regression Bugs Are In Your Model! Measuring, Reducing and Analyzing Regressions In NLP Model Updates [68.09049111171862]
This work focuses on quantifying, reducing and analyzing regression errors in NLP model updates.
We formulate the regression-free model updates into a constrained optimization problem.
We empirically analyze how model ensemble reduces regression.
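One common way to quantify such update regressions is the negative flip rate: the fraction of examples the old model predicted correctly that the updated model gets wrong. A small sketch, assuming classification labels:

```python
import numpy as np

def negative_flip_rate(y_true, old_pred, new_pred):
    """Fraction of examples the old model got right but the update breaks."""
    old_correct = old_pred == y_true
    new_wrong = new_pred != y_true
    return float(np.mean(old_correct & new_wrong))

# Toy usage.
y_true = np.array([0, 1, 1, 0, 2])
old = np.array([0, 1, 1, 1, 2])   # 4/5 correct
new = np.array([0, 1, 0, 0, 2])   # fixes one error, breaks one correct case
print(negative_flip_rate(y_true, old, new))  # 0.2
```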
arXiv Detail & Related papers (2021-05-07T03:33:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.