Universal Training of Neural Networks to Achieve Bayes Optimal Classification Accuracy
- URL: http://arxiv.org/abs/2501.07754v1
- Date: Mon, 13 Jan 2025 23:55:11 GMT
- Title: Universal Training of Neural Networks to Achieve Bayes Optimal Classification Accuracy
- Authors: Mohammadreza Tavasoli Naeini, Ali Bereyhi, Morteza Noshad, Ben Liang, Alfred O. Hero III
- Abstract summary: We introduce the Bayes optimal learning threshold (BOLT) loss whose minimization enforces a classification model to achieve the Bayes error rate.
Numerical experiments demonstrate that models trained with BOLT achieve performance on par with or exceeding that of cross-entropy.
- Score: 27.092821207089067
- License:
- Abstract: This work invokes the notion of $f$-divergence to introduce a novel upper bound on the Bayes error rate of a general classification task. We show that the proposed bound can be computed by sampling from the output of a parameterized model. Using this practical interpretation, we introduce the Bayes optimal learning threshold (BOLT) loss whose minimization enforces a classification model to achieve the Bayes error rate. We validate the proposed loss for image and text classification tasks, considering MNIST, Fashion-MNIST, CIFAR-10, and IMDb datasets. Numerical experiments demonstrate that models trained with BOLT achieve performance on par with or exceeding that of cross-entropy, particularly on challenging datasets. This highlights the potential of BOLT in improving generalization.
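The abstract does not spell out the BOLT loss in closed form, so the snippet below is only a minimal, hedged sketch of the closely related plug-in estimate of the Bayes error rate, BER = E_x[1 - max_y p(y|x)], computed from a trained classifier's softmax outputs; the function and loader names are hypothetical, and this is not the paper's bound or loss.

```python
import torch

@torch.no_grad()
def plugin_bayes_error_estimate(model, loader, device="cpu"):
    # Plug-in estimate of the Bayes error rate, E_x[1 - max_y p(y|x)],
    # using the model's softmax outputs in place of the true posterior.
    # This is NOT the BOLT loss itself, only a generic illustration.
    model.eval()
    total, count = 0.0, 0
    for inputs, _ in loader:  # labels are not needed for this estimate
        probs = torch.softmax(model(inputs.to(device)), dim=-1)
        total += (1.0 - probs.max(dim=-1).values).sum().item()
        count += inputs.size(0)
    return total / count
```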
Related papers
- SINDER: Repairing the Singular Defects of DINOv2 [61.98878352956125]
Vision Transformer models trained on large-scale datasets often exhibit artifacts in the patch tokens they extract.
We propose a novel fine-tuning smooth-regularization method that rectifies these structural deficiencies using only a small dataset.
arXiv Detail & Related papers (2024-07-23T20:34:23Z)
- Rethinking Classifier Re-Training in Long-Tailed Recognition: A Simple Logits Retargeting Approach [102.0769560460338]
We develop a simple logits retargeting approach (LORT) that requires no prior knowledge of the number of samples per class.
Our method achieves state-of-the-art performance on various imbalanced datasets, including CIFAR100-LT, ImageNet-LT, and iNaturalist 2018.
arXiv Detail & Related papers (2024-03-01T03:27:08Z) - Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation [63.180725016463974]
Cross-modal retrieval relies on well-matched large-scale datasets that are laborious to collect in practice.
We introduce a novel noisy correspondence learning framework, namely Self-Reinforcing Errors Mitigation (SREM).
arXiv Detail & Related papers (2023-12-27T09:03:43Z)
- Bayesian Layer Graph Convolutional Network for Hyperspectral Image Classification [24.91896527342631]
Graph convolutional network (GCN) based models have shown impressive performance.
Deep learning frameworks based on point estimation suffer from low generalization and cannot quantify the uncertainty of their classification results.
In this paper, we propose a Bayesian layer that can be inserted into point-estimation-based neural networks.
A Generative Adversarial Network (GAN) is built to solve the sample imbalance problem of the HSI dataset.
arXiv Detail & Related papers (2022-11-14T12:56:56Z) - Fine-grained Retrieval Prompt Tuning [149.9071858259279]
Fine-grained Retrieval Prompt Tuning steers a frozen pre-trained model to perform the fine-grained retrieval task from the perspectives of sample prompt and feature adaptation.
With fewer learnable parameters, our FRPT achieves state-of-the-art performance on three widely used fine-grained datasets.
arXiv Detail & Related papers (2022-07-29T04:10:04Z)
- Is the Performance of My Deep Network Too Good to Be True? A Direct Approach to Estimating the Bayes Error in Binary Classification [86.32752788233913]
In classification problems, the Bayes error can be used as a criterion to evaluate classifiers with state-of-the-art performance.
We propose a simple and direct Bayes error estimator, where we simply take the mean of the labels that show the uncertainty of the classes.
Our flexible approach enables us to perform Bayes error estimation even for weakly supervised data.
arXiv Detail & Related papers (2022-02-01T13:22:26Z)
- Influential Rank: A New Perspective of Post-training for Robust Model against Noisy Labels [23.80449026013167]
We propose a new approach for learning from noisy labels (LNL) via post-training.
We exploit the overfitting property of a trained model to identify mislabeled samples.
Our post-training approach creates strong synergies when combined with existing LNL methods.
arXiv Detail & Related papers (2021-06-14T08:04:18Z)
- Evaluating State-of-the-Art Classification Models Against Bayes Optimality [106.50867011164584]
We show that we can compute the exact Bayes error of generative models learned using normalizing flows.
We use our approach to conduct a thorough investigation of state-of-the-art classification models.
arXiv Detail & Related papers (2021-06-07T06:21:20Z)
- Self-Supervised Contrastive Learning for Unsupervised Phoneme Segmentation [37.054709598792165]
The model is a convolutional neural network that operates directly on the raw waveform.
It is optimized to identify spectral changes in the signal using the Noise-Contrastive Estimation principle.
At test time, a peak detection algorithm is applied over the model outputs to produce the final boundaries (a rough sketch of this step appears below).
arXiv Detail & Related papers (2020-07-27T12:10:21Z)
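As a rough illustration of the test-time step described in the entry above (peak detection over frame-level model scores), here is a minimal sketch using scipy.signal.find_peaks; the score array, frame rate, and prominence threshold are assumed for illustration and are not details from the paper.

```python
import numpy as np
from scipy.signal import find_peaks

def scores_to_boundaries(frame_scores, frame_rate_hz=100.0, prominence=0.1):
    # Pick local maxima of per-frame boundary scores and convert the
    # selected frame indices to times in seconds. A generic illustration,
    # not the exact peak-detection procedure used in the paper.
    peaks, _ = find_peaks(np.asarray(frame_scores, dtype=float), prominence=prominence)
    return peaks / frame_rate_hz
```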