Online Bayesian Imbalanced Learning with Bregman-Calibrated Deep Networks
- URL: http://arxiv.org/abs/2602.08128v1
- Date: Sun, 08 Feb 2026 21:23:00 GMT
- Title: Online Bayesian Imbalanced Learning with Bregman-Calibrated Deep Networks
- Authors: Zahir Alsulaimawi,
- Abstract summary: We present textitOnline Bayesian Imbalanced Learning (OBIL), a principled framework that decouples likelihood-ratio estimation from class-prior assumptions.<n>Our approach builds on the established connection between Bregman divergences and proper scoring rules to show that deep networks trained with such losses produce posterior probability estimates.<n>We prove that these likelihood-ratio estimates remain valid under arbitrary changes in class priors and cost structures, requiring only a threshold adjustment for optimal Bayes decisions.
- Score: 0.7106986689736825
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Class imbalance remains a fundamental challenge in machine learning, where standard classifiers exhibit severe performance degradation in minority classes. Although existing approaches address imbalance through resampling or cost-sensitive learning during training, they require retraining or access to labeled target data when class distributions shift at deployment time, a common occurrence in real-world applications such as fraud detection, medical diagnosis, and anomaly detection. We present \textit{Online Bayesian Imbalanced Learning} (OBIL), a principled framework that decouples likelihood-ratio estimation from class-prior assumptions, enabling real-time adaptation to distribution shifts without model retraining. Our approach builds on the established connection between Bregman divergences and proper scoring rules to show that deep networks trained with such losses produce posterior probability estimates from which prior-invariant likelihood ratios can be extracted. We prove that these likelihood-ratio estimates remain valid under arbitrary changes in class priors and cost structures, requiring only a threshold adjustment for optimal Bayes decisions. We derive finite-sample regret bounds demonstrating that OBIL achieves $O(\sqrt{T \log T})$ regret against an oracle with perfect prior knowledge. Extensive experiments on benchmark datasets and medical diagnosis benchmarks under simulated deployment shifts demonstrate that OBIL maintains robust performance under severe distribution shifts, outperforming state-of-the-art methods in F1 Score when test distributions deviate significantly from the training conditions.
Related papers
- Optimal training-conditional regret for online conformal prediction [20.643619398558315]
We study online conformal prediction for non-stationary data streams subject to unknown distribution drift.<n>We specifically focus on independently generated data with two types of distribution shift: abrupt change points and smooth drift.<n>We establish non-asymptotic regret guarantees for our online full conformal algorithm, which match the minimax lower bound under appropriate restrictions on the prediction sets.
arXiv Detail & Related papers (2026-02-18T15:31:15Z) - Class Confidence Aware Reweighting for Long Tailed Learning [0.8297806372438926]
We present the design of a class and confidence-aware re-weighting scheme for long-tailed learning.<n>We use an (p_t, f_c) function to enable the modulation of the contribution towards the training task based upon the confidence value of the prediction.
arXiv Detail & Related papers (2026-01-22T12:58:05Z) - Bayesian Test-time Adaptation for Object Recognition and Detection with Vision-language Models [86.53246292425699]
We present BCA+, a training-free framework for TTA for both object recognition and detection.<n>We formulate adaptation as a Bayesian inference problem, where final predictions are generated by fusing the initial VLM output with a cache-based prediction.<n>BCA+ achieves state-of-the-art performance on both recognition and detection benchmarks.
arXiv Detail & Related papers (2025-10-03T06:27:33Z) - Backpropagation-Free Test-Time Adaptation via Probabilistic Gaussian Alignment [16.352863226512984]
Test-time adaptation (TTA) enhances the zero-shot robustness under distribution shifts by leveraging unlabeled test data during inference.<n>Most methods rely on backpropagation or iterative optimization, which limits scalability and hinders real-time deployment.<n>We propose ADAPT, an Advanced Distribution-Aware and back propagation-free Test-time adaptation method.
arXiv Detail & Related papers (2025-08-21T13:42:49Z) - Rethinking Classifier Re-Training in Long-Tailed Recognition: A Simple
Logits Retargeting Approach [102.0769560460338]
We develop a simple logits approach (LORT) without the requirement of prior knowledge of the number of samples per class.
Our method achieves state-of-the-art performance on various imbalanced datasets, including CIFAR100-LT, ImageNet-LT, and iNaturalist 2018.
arXiv Detail & Related papers (2024-03-01T03:27:08Z) - Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation [63.180725016463974]
Cross-modal retrieval relies on well-matched large-scale datasets that are laborious in practice.
We introduce a novel noisy correspondence learning framework, namely textbfSelf-textbfReinforcing textbfErrors textbfMitigation (SREM)
arXiv Detail & Related papers (2023-12-27T09:03:43Z) - Bias Amplification Enhances Minority Group Performance [10.380812738348899]
We propose BAM, a novel two-stage training algorithm.
In the first stage, the model is trained using a bias amplification scheme via introducing a learnable auxiliary variable for each training sample.
In the second stage, we upweight the samples that the bias-amplified model misclassifies, and then continue training the same model on the reweighted dataset.
arXiv Detail & Related papers (2023-09-13T04:40:08Z) - Improving Adaptive Conformal Prediction Using Self-Supervised Learning [72.2614468437919]
We train an auxiliary model with a self-supervised pretext task on top of an existing predictive model and use the self-supervised error as an additional feature to estimate nonconformity scores.
We empirically demonstrate the benefit of the additional information using both synthetic and real data on the efficiency (width), deficit, and excess of conformal prediction intervals.
arXiv Detail & Related papers (2023-02-23T18:57:14Z) - Semantic Self-adaptation: Enhancing Generalization with a Single Sample [45.111358665370524]
We propose a self-adaptive approach for semantic segmentation.
It fine-tunes the parameters of convolutional layers to the input image using consistency regularization.
Our empirical study suggests that self-adaptation may complement the established practice of model regularization at training time.
arXiv Detail & Related papers (2022-08-10T12:29:01Z) - CAFA: Class-Aware Feature Alignment for Test-Time Adaptation [50.26963784271912]
Test-time adaptation (TTA) aims to address this challenge by adapting a model to unlabeled data at test time.
We propose a simple yet effective feature alignment loss, termed as Class-Aware Feature Alignment (CAFA), which simultaneously encourages a model to learn target representations in a class-discriminative manner.
arXiv Detail & Related papers (2022-06-01T03:02:07Z) - Distributionally Robust Models with Parametric Likelihood Ratios [123.05074253513935]
Three simple ideas allow us to train models with DRO using a broader class of parametric likelihood ratios.
We find that models trained with the resulting parametric adversaries are consistently more robust to subpopulation shifts when compared to other DRO approaches.
arXiv Detail & Related papers (2022-04-13T12:43:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.