Entropic Out-of-Distribution Detection: Seamless Detection of Unknown
Examples
- URL: http://arxiv.org/abs/2006.04005v3
- Date: Wed, 4 Aug 2021 18:30:05 GMT
- Title: Entropic Out-of-Distribution Detection: Seamless Detection of Unknown
Examples
- Authors: David Macêdo, Tsang Ing Ren, Cleber Zanchettin, Adriano L. I.
Oliveira, Teresa Ludermir
- Abstract summary: We propose replacing SoftMax loss with a novel loss function that does not suffer from the mentioned weaknesses.
The proposed IsoMax loss is isotropic (exclusively distance-based) and provides high entropy posterior probability distributions.
Our experiments showed that IsoMax loss works as a seamless SoftMax loss drop-in replacement that significantly improves neural networks' OOD detection performance.
- Score: 8.284193221280214
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we argue that the unsatisfactory out-of-distribution (OOD)
detection performance of neural networks is mainly due to the SoftMax loss
anisotropy and propensity to produce low entropy probability distributions in
disagreement with the principle of maximum entropy. Current OOD detection
approaches usually do not directly fix the SoftMax loss drawbacks, but rather
build techniques to circumvent them. Unfortunately, those methods usually
produce undesired side effects (e.g., classification accuracy drops, additional
hyperparameters, slower inference, and the need to collect extra data). Taking
the opposite direction, we propose replacing the SoftMax loss with a
novel loss function that does not suffer from the mentioned weaknesses. The
proposed IsoMax loss is isotropic (exclusively distance-based) and provides
high entropy posterior probability distributions. Replacing the SoftMax loss
with the IsoMax loss requires no model or training changes. Additionally, the models
trained with IsoMax loss produce as fast and energy-efficient inferences as
those trained using SoftMax loss. Moreover, no classification accuracy drop is
observed. The proposed method does not rely on outlier/background data,
hyperparameter tuning, temperature calibration, feature extraction, metric
learning, adversarial training, ensemble procedures, or generative models. Our
experiments showed that IsoMax loss works as a seamless SoftMax loss drop-in
replacement that significantly improves neural networks' OOD detection
performance. Hence, it may be used as a baseline OOD detection approach, to be
combined with current or future OOD detection techniques to achieve even better
results.
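Since the abstract describes IsoMax as an isotropic, exclusively distance-based drop-in replacement for the SoftMax loss (the final affine layer plus cross-entropy), a minimal PyTorch sketch of such a head is given below. This is an illustration of the idea rather than the authors' reference implementation: the learnable class prototypes, the fixed entropic scale value, and the names `IsoMaxLossHead` and `ENTROPIC_SCALE` are assumptions made for this example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical constant: the paper applies a fixed "entropic scale" to the
# logits during training only; the exact value here is an assumption.
ENTROPIC_SCALE = 10.0


class IsoMaxLossHead(nn.Module):
    """Sketch of an isotropic (exclusively distance-based) classification head.

    Logits are negative Euclidean distances between the feature vector and
    learnable class prototypes, so this module can replace the usual affine
    layer that feeds a SoftMax/cross-entropy loss.
    """

    def __init__(self, num_features: int, num_classes: int):
        super().__init__()
        self.prototypes = nn.Parameter(torch.zeros(num_classes, num_features))
        nn.init.normal_(self.prototypes, mean=0.0, std=1.0)

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # Pairwise Euclidean distances, shape (batch, num_classes).
        distances = torch.cdist(features, self.prototypes)
        return -distances  # distance-based (isotropic) logits

    def loss(self, features: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
        # The entropic scale is used only for training and is dropped at
        # inference time, keeping inference as cheap as with SoftMax.
        logits = self.forward(features)
        return F.cross_entropy(ENTROPIC_SCALE * logits, targets)


# Usage sketch: swap the final nn.Linear of a classifier for this head.
# head = IsoMaxLossHead(num_features=512, num_classes=10)
# logits = head(backbone(x))                 # inference: no entropic scale
# training_loss = head.loss(backbone(x), y)  # training: scaled cross-entropy
```

Because the entropic scale is dropped at inference, the forward pass reduces to a plain distance computation, which is consistent with the abstract's claim that inference remains as fast and energy-efficient as with the SoftMax loss.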
Related papers
- In Defense of Softmax Parametrization for Calibrated and Consistent
Learning to Defer [27.025808709031864]
It has been theoretically shown that popular estimators for learning to defer parameterized with softmax provide unbounded estimates for the likelihood of deferring.
We show that the miscalibrated and unbounded estimators reported in the prior literature stem from the symmetric nature of the surrogate losses used, not from softmax itself.
We propose a novel statistically consistent asymmetric softmax-based surrogate loss that can produce valid estimates without the issue of unboundedness.
arXiv Detail & Related papers (2023-11-02T09:15:52Z)
- Spectral Aware Softmax for Visible-Infrared Person Re-Identification [123.69049942659285]
Visible-infrared person re-identification (VI-ReID) aims to match specific pedestrian images from different modalities.
Existing methods still follow the softmax loss training paradigm, which is widely used in single-modality classification tasks.
We propose the spectral-aware softmax (SA-Softmax) loss, which can fully explore the embedding space with the modality information.
arXiv Detail & Related papers (2023-02-03T02:57:18Z)
- Distinction Maximization Loss: Efficiently Improving Classification
Accuracy, Uncertainty Estimation, and Out-of-Distribution Detection Simply
Replacing the Loss and Calibrating [2.262407399039118]
We propose training deterministic deep neural networks using our DisMax loss.
DisMax usually outperforms all current approaches simultaneously in classification accuracy, uncertainty estimation, inference efficiency, and out-of-distribution detection.
arXiv Detail & Related papers (2022-05-12T04:37:35Z)
- Robust Depth Completion with Uncertainty-Driven Loss Functions [60.9237639890582]
We introduce uncertainty-driven loss functions to improve the robustness of depth completion and to handle its inherent uncertainty.
Our method has been tested on the KITTI Depth Completion Benchmark and achieves state-of-the-art robustness in terms of the MAE, IMAE, and IRMSE metrics.
arXiv Detail & Related papers (2021-12-15T05:22:34Z)
- Understanding Softmax Confidence and Uncertainty [95.71801498763216]
It is often remarked that neural networks fail to increase their uncertainty when predicting on data far from the training distribution.
Yet naively using softmax confidence as a proxy for uncertainty achieves modest success in tasks exclusively testing for this.
This paper investigates this contradiction, identifying two implicit biases that do encourage softmax confidence to correlate with uncertainty.
arXiv Detail & Related papers (2021-06-09T10:37:29Z)
- Improving Entropic Out-of-Distribution Detection using Isometric
Distances and the Minimum Distance Score [0.0]
The entropic out-of-distribution detection solution comprises the IsoMax loss for training and the entropic score for out-of-distribution detection.
We propose to perform an isometrization of the distances used in the IsoMax loss and to replace the entropic score with the minimum distance score (both scores are sketched after this list).
Our experiments showed that these simple modifications increase out-of-distribution detection performance while keeping the solution seamless.
arXiv Detail & Related papers (2021-05-30T00:55:03Z)
- Deterministic Neural Networks with Appropriate Inductive Biases Capture
Epistemic and Aleatoric Uncertainty [91.01037972035635]
We show that a single softmax neural net with minimal changes can beat the uncertainty predictions of Deep Ensembles.
We study why, and show that with the right inductive biases, softmax neural nets trained with maximum likelihood reliably capture uncertainty through the feature-space density.
arXiv Detail & Related papers (2021-02-23T09:44:09Z) - Amortized Conditional Normalized Maximum Likelihood: Reliable Out of
Distribution Uncertainty Estimation [99.92568326314667]
We propose the amortized conditional normalized maximum likelihood (ACNML) method as a scalable general-purpose approach for uncertainty estimation.
Our algorithm builds on the conditional normalized maximum likelihood (CNML) coding scheme, which has minimax optimal properties according to the minimum description length principle.
We demonstrate that ACNML compares favorably to a number of prior techniques for uncertainty estimation in terms of calibration on out-of-distribution inputs.
arXiv Detail & Related papers (2020-11-05T08:04:34Z) - Uncertainty Estimation Using a Single Deep Deterministic Neural Network [66.26231423824089]
We propose a method for training a deterministic deep model that can find and reject out-of-distribution data points at test time with a single forward pass.
We scale training using a novel loss function and centroid updating scheme, and match the accuracy of softmax models.
arXiv Detail & Related papers (2020-03-04T12:27:36Z)