BAPE: Learning an Explicit Bayes Classifier for Long-tailed Visual Recognition
- URL: http://arxiv.org/abs/2506.23280v1
- Date: Sun, 29 Jun 2025 15:12:50 GMT
- Title: BAPE: Learning an Explicit Bayes Classifier for Long-tailed Visual Recognition
- Authors: Chaoqun Du, Yulin Wang, Shiji Song, Gao Huang
- Abstract summary: Current deep learning algorithms usually solve for the optimal classifier by \emph{implicitly} estimating the posterior probabilities. This simple methodology has been proven effective for meticulously balanced academic benchmark datasets. However, it is not applicable to the long-tailed data distributions in the real world. This paper presents a novel approach (BAPE) that provides a more precise theoretical estimation of the data distributions.
- Score: 78.70453964041718
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Bayesian decision theory advocates the Bayes classifier as the optimal approach for minimizing the risk in machine learning problems. Current deep learning algorithms usually solve for the optimal classifier by \emph{implicitly} estimating the posterior probabilities, \emph{e.g.}, by minimizing the Softmax cross-entropy loss. This simple methodology has been proven effective for meticulously balanced academic benchmark datasets. However, it is not applicable to the long-tailed data distributions in the real world, where it leads to the gradient imbalance issue and fails to ensure the Bayes optimal decision rule. To address these challenges, this paper presents a novel approach (BAPE) that provides a more precise theoretical estimation of the data distributions by \emph{explicitly} modeling the parameters of the posterior probabilities and solving them with point estimation. Consequently, our method directly learns the Bayes classifier without gradient descent based on Bayes' theorem, simultaneously alleviating the gradient imbalance and ensuring the Bayes optimal decision rule. Furthermore, we propose a straightforward yet effective \emph{distribution adjustment} technique. This method enables the Bayes classifier trained from the long-tailed training set to effectively adapt to the test data distribution with an arbitrary imbalance factor, thereby enhancing performance without incurring additional computational costs. In addition, we demonstrate the gains of our method are orthogonal to existing learning approaches for long-tailed scenarios, as they are mostly designed under the principle of \emph{implicitly} estimating the posterior probabilities. Extensive empirical evaluations on CIFAR-10-LT, CIFAR-100-LT, ImageNet-LT, and iNaturalist demonstrate that our method significantly improves the generalization performance of popular deep networks, despite its simplicity.
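The abstract describes two mechanisms without giving their exact form: explicit point estimation of the posterior's parameters in place of gradient-based classifier training, and a distribution adjustment that re-weights class priors at test time. The sketch below is a hypothetical illustration only; it assumes per-class Gaussian class-conditionals with a shared diagonal covariance over fixed features, and a `test_priors` argument standing in for the adjustment. None of these names or modeling choices come from the paper.

```python
# Minimal, hypothetical sketch of an explicit Bayes classifier with a test-time
# prior adjustment. The Gaussian class-conditional assumption and all names here
# are illustrative; the abstract does not specify BAPE's parametric form.
import numpy as np

class ExplicitBayesClassifier:
    def fit(self, feats, labels, num_classes):
        # Point estimation of the distribution parameters: no gradient descent.
        self.priors = np.bincount(labels, minlength=num_classes) / len(labels)
        self.means = np.stack([feats[labels == c].mean(axis=0)
                               for c in range(num_classes)])
        self.var = feats.var(axis=0) + 1e-6  # shared diagonal covariance (assumption)
        return self

    def log_posterior(self, feats, test_priors=None):
        # Bayes' theorem: log p(y|x) = log p(x|y) + log p(y) + const.
        priors = self.priors if test_priors is None else np.asarray(test_priors)
        diff = feats[:, None, :] - self.means[None, :, :]      # shape (N, C, D)
        log_lik = -0.5 * (diff ** 2 / self.var).sum(axis=-1)   # Gaussian log-likelihood up to a constant
        return log_lik + np.log(priors)[None, :]

    def predict(self, feats, test_priors=None):
        # Swapping in different test priors plays the role of the
        # "distribution adjustment": e.g. uniform priors target a balanced test set.
        return self.log_posterior(feats, test_priors).argmax(axis=1)
```

Because the adjustment only replaces the prior vector at inference time, adapting to a test set with a different imbalance factor requires no retraining, which is consistent with the abstract's claim of no additional computational cost.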
Related papers
- Achieving $\widetilde{\mathcal{O}}(\sqrt{T})$ Regret in Average-Reward POMDPs with Known Observation Models [56.92178753201331]
We tackle average-reward infinite-horizon POMDPs with an unknown transition model. We present a novel and simple estimator that overcomes this barrier.
arXiv Detail & Related papers (2025-01-30T22:29:41Z)
- Learning the Optimal Stopping for Early Classification within Finite Horizons via Sequential Probability Ratio Test [11.199585259018459]
Time-sensitive machine learning benefits from the Sequential Probability Ratio Test (SPRT), which provides an optimal stopping time for early classification of time series. In finite-horizon scenarios, where input lengths are finite, determining the optimal stopping rule becomes computationally intensive due to the need for backward induction. We introduce FIRMBOUND, an SPRT-based framework that efficiently estimates the solution to backward induction from training data.
arXiv Detail & Related papers (2025-01-29T23:54:46Z)
- Universal Training of Neural Networks to Achieve Bayes Optimal Classification Accuracy [27.092821207089067]
We introduce the Bayes optimal learning threshold (BOLT) loss, whose minimization enforces a classification model to achieve the Bayes error rate. Numerical experiments demonstrate that models trained with BOLT achieve performance on par with or exceeding that of cross-entropy.
arXiv Detail & Related papers (2025-01-13T23:55:11Z)
- Distributionally Robust Skeleton Learning of Discrete Bayesian Networks [9.46389554092506]
We consider the problem of learning the exact skeleton of general discrete Bayesian networks from potentially corrupted data.
We propose to optimize the most adverse risk over a family of distributions within bounded Wasserstein distance or KL divergence to the empirical distribution.
We present efficient algorithms and show the proposed methods are closely related to the standard regularized regression approach.
arXiv Detail & Related papers (2023-11-10T15:33:19Z)
- Provably Efficient Bayesian Optimization with Unknown Gaussian Process Hyperparameter Estimation [44.53678257757108]
We propose a new BO method that can sub-linearly converge to the objective function's global optimum.
Our method uses a multi-armed bandit technique (EXP3) to add random data points to the BO process.
We demonstrate empirically that our method outperforms existing approaches on various synthetic and real-world problems.
arXiv Detail & Related papers (2023-06-12T03:35:45Z)
- Scalable Bayesian Meta-Learning through Generalized Implicit Gradients [64.21628447579772]
The implicit Bayesian meta-learning (iBaML) method not only broadens the scope of learnable priors but also quantifies the associated uncertainty.
Analytical error bounds are established to demonstrate the precision and efficiency of the generalized implicit gradient over the explicit one.
arXiv Detail & Related papers (2023-03-31T02:10:30Z)
- Learning to Estimate Without Bias [57.82628598276623]
The Gauss-Markov theorem states that the weighted least squares estimator is a linear minimum variance unbiased estimator (MVUE) in linear models.
In this paper, we take a first step towards extending this result to nonlinear settings via deep learning with bias constraints.
A second motivation for BCE is in applications where multiple estimates of the same unknown are averaged for improved performance.
arXiv Detail & Related papers (2021-10-24T10:23:51Z)
- Evaluating State-of-the-Art Classification Models Against Bayes Optimality [106.50867011164584]
We show that we can compute the exact Bayes error of generative models learned using normalizing flows.
We use our approach to conduct a thorough investigation of state-of-the-art classification models.
arXiv Detail & Related papers (2021-06-07T06:21:20Z)
- Scalable Marginal Likelihood Estimation for Model Selection in Deep Learning [78.83598532168256]
Marginal-likelihood based model-selection is rarely used in deep learning due to estimation difficulties.
Our work shows that marginal likelihoods can improve generalization and be useful when validation data is unavailable.
arXiv Detail & Related papers (2021-04-11T09:50:24Z)
- Fair Classification via Unconstrained Optimization [0.0]
We show that the Bayes optimal fair learning rule remains a group-wise thresholding rule over the Bayes regressor.
The proposed algorithm can be applied to any black-box machine learning model.
arXiv Detail & Related papers (2020-05-21T11:29:05Z)