Distinction Maximization Loss: Efficiently Improving Classification
Accuracy, Uncertainty Estimation, and Out-of-Distribution Detection Simply
Replacing the Loss and Calibrating
- URL: http://arxiv.org/abs/2205.05874v1
- Date: Thu, 12 May 2022 04:37:35 GMT
- Title: Distinction Maximization Loss: Efficiently Improving Classification
Accuracy, Uncertainty Estimation, and Out-of-Distribution Detection Simply
Replacing the Loss and Calibrating
- Authors: David Macêdo, Cleber Zanchettin, Teresa Ludermir
- Abstract summary: We propose training deterministic deep neural networks using our DisMax loss.
DisMax usually outperforms all current approaches simultaneously in classification accuracy, uncertainty estimation, inference efficiency, and out-of-distribution detection.
- Score: 2.262407399039118
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Building robust deterministic deep neural networks is still a challenge. On
the one hand, some approaches improve out-of-distribution detection at the cost
of reducing classification accuracy in some situations. On the other hand, some
methods simultaneously increase classification accuracy, out-of-distribution
detection, and uncertainty estimation, but reduce inference efficiency, in
addition to training the same model many times to tune hyperparameters. In this
paper, we propose training deterministic deep neural networks using our DisMax
loss, which works as a drop-in replacement for the commonly used SoftMax loss
(i.e., the combination of the linear output layer, the SoftMax activation, and
the cross-entropy loss). Starting from the IsoMax+ loss, we created novel
logits that are based on the distance to all prototypes rather than just the
one associated with the correct class. We also propose a novel way to augment
images to construct what we call fractional probability regularization.
Moreover, we propose a new score to perform out-of-distribution detection and a
fast way to calibrate the network after training. Our experiments show that
DisMax usually outperforms all current approaches simultaneously in
classification accuracy, uncertainty estimation, inference efficiency, and
out-of-distribution detection, avoiding hyperparameter tuning and repetitive
model training. The code to replace the SoftMax loss with the DisMax loss and
reproduce the results in this paper is available at
https://github.com/dlmacedo/distinction-maximization-loss.
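As a rough illustration of the ideas in the abstract (a distance-based drop-in replacement for the SoftMax head, an out-of-distribution score computed from the logits, and a fast post-training calibration), the sketch below is written in PyTorch. The DistanceBasedLogits module, the particular way each class distance is mixed with the mean distance to all prototypes, the stand-in ood_score, and the grid-search calibrate_temperature helper are illustrative assumptions rather than the paper's exact formulation; the official repository linked above contains the actual DisMax implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class DistanceBasedLogits(nn.Module):
    """Drop-in replacement for the usual linear classification head: logits are
    negative distances between L2-normalized features and L2-normalized
    learnable class prototypes, scaled by a learnable factor (IsoMax+-style)."""
    def __init__(self, num_features: int, num_classes: int):
        super().__init__()
        self.prototypes = nn.Parameter(torch.randn(num_classes, num_features))
        self.distance_scale = nn.Parameter(torch.tensor(1.0))

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        distances = torch.cdist(F.normalize(features), F.normalize(self.prototypes))
        # Illustrative "distance to all prototypes" logit: each class logit mixes
        # its own distance with the mean distance over every prototype (an assumed
        # combination, not the exact DisMax formula).
        enhanced = distances + distances.mean(dim=1, keepdim=True)
        return -torch.abs(self.distance_scale) * enhanced

def ood_score(logits: torch.Tensor) -> torch.Tensor:
    """Stand-in OOD score (higher = more in-distribution): maximum logit minus
    the entropy of the predicted probabilities. The paper proposes its own
    score; this one only has the same flavor."""
    probs = F.softmax(logits, dim=1)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=1)
    return logits.max(dim=1).values - entropy

def calibrate_temperature(val_logits: torch.Tensor, val_labels: torch.Tensor) -> float:
    """Post-hoc calibration: grid-search a single temperature on validation
    logits, so no retraining of the network is required."""
    grid = torch.linspace(0.01, 3.0, 300)
    losses = [F.cross_entropy(val_logits / t, val_labels).item() for t in grid]
    return float(grid[int(torch.tensor(losses).argmin())])

In this sketch, training differs from a standard pipeline only in that the final linear layer is swapped for DistanceBasedLogits (cross-entropy is applied to its outputs as usual), and calibration afterwards touches nothing but a temperature that divides the logits.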
Related papers
- Dirichlet-Based Prediction Calibration for Learning with Noisy Labels [40.78497779769083]
Learning with noisy labels can significantly hinder the generalization performance of deep neural networks (DNNs).
Existing approaches address this issue through loss correction or example selection methods.
We propose the Dirichlet-based Prediction Calibration (DPC) method as a solution.
arXiv Detail & Related papers (2024-01-13T12:33:04Z) - Variational Classification [51.2541371924591]
We derive a variational objective to train the model, analogous to the evidence lower bound (ELBO) used to train variational auto-encoders.
Treating the inputs to the softmax layer as samples of a latent variable, this abstracted perspective reveals a potential inconsistency.
Rather than relying on the implicit assumption made by a standard softmax layer, we induce a chosen latent distribution.
arXiv Detail & Related papers (2023-05-17T17:47:19Z) - A Softmax-free Loss Function Based on Predefined Optimal-distribution of
Latent Features for CNN Classifier [4.7210697296108926]
This article proposes a Softmax-free loss function (POD Loss) based on predefined optimal-distribution of latent features.
The loss function constrains only the latent features of the samples, in particular the cosine distance between each sample's latent feature vector and its predefined, evenly distributed class center.
Compared with the commonly used Softmax Loss and typical Softmax-related losses (AM-Softmax Loss, COT-Loss, and PEDCC-Loss), experiments on several commonly used datasets with a typical network show that POD Loss consistently performs better and converges more easily.
arXiv Detail & Related papers (2021-11-25T06:01:53Z) - X-model: Improving Data Efficiency in Deep Learning with A Minimax Model [78.55482897452417]
We aim at improving data efficiency for both classification and regression setups in deep learning.
To harness the best of both worlds, we propose a novel X-model.
X-model plays a minimax game between the feature extractor and task-specific heads.
arXiv Detail & Related papers (2021-10-09T13:56:48Z) - Distribution Mismatch Correction for Improved Robustness in Deep Neural
Networks [86.42889611784855]
Normalization methods can increase vulnerability to noise and input corruptions.
We propose an unsupervised non-parametric distribution correction method that adapts the activation distribution of each layer.
In our experiments, we empirically show that the proposed method effectively reduces the impact of intense image corruptions.
arXiv Detail & Related papers (2021-10-05T11:36:25Z) - Improving Entropic Out-of-Distribution Detection using Isometric
Distances and the Minimum Distance Score [0.0]
The entropic out-of-distribution detection solution comprises the IsoMax loss for training and the entropic score for out-of-distribution detection.
We propose to perform an isometrization of the distances used in the IsoMax loss and replace the entropic score with the minimum distance score.
Our experiments showed that these simple modifications increase out-of-distribution detection performance while keeping the solution seamless.
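For reference, the minimum distance score mentioned above can be computed directly from the feature-to-prototype distances of a distance-based head such as the one sketched after the main abstract; the helper below is a minimal illustration, not the authors' code.

import torch

def minimum_distance_score(distances: torch.Tensor) -> torch.Tensor:
    """Minimum distance score: negative distance to the nearest class prototype.
    `distances` is a (batch, num_classes) tensor of feature-to-prototype
    distances; higher scores indicate in-distribution inputs, lower scores flag
    likely out-of-distribution inputs."""
    return -distances.min(dim=1).values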
arXiv Detail & Related papers (2021-05-30T00:55:03Z) - Shaping Deep Feature Space towards Gaussian Mixture for Visual
Classification [74.48695037007306]
We propose a Gaussian mixture (GM) loss function for deep neural networks for visual classification.
With a classification margin and a likelihood regularization, the GM loss facilitates both high classification performance and accurate modeling of the feature distribution.
The proposed model can be implemented easily and efficiently without using extra trainable parameters.
arXiv Detail & Related papers (2020-11-18T03:32:27Z) - Entropic Out-of-Distribution Detection: Seamless Detection of Unknown
Examples [8.284193221280214]
We propose replacing the SoftMax loss with a novel loss function that does not suffer from the SoftMax loss's weaknesses (anisotropy and low-entropy posterior distributions).
The proposed IsoMax loss is isotropic (exclusively distance-based) and provides high entropy posterior probability distributions.
Our experiments showed that IsoMax loss works as a seamless SoftMax loss drop-in replacement that significantly improves neural networks' OOD detection performance.
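The entropic score referenced here is simply the (negative) entropy of the network's posterior distribution: in-distribution inputs tend to yield confident, low-entropy predictions. A minimal illustration, not the authors' implementation:

import torch
import torch.nn.functional as F

def entropic_score(logits: torch.Tensor) -> torch.Tensor:
    """Entropic score: negative Shannon entropy of the predicted probabilities.
    Higher scores (lower entropy) indicate in-distribution inputs; low scores
    (high entropy) flag likely out-of-distribution examples."""
    probs = F.softmax(logits, dim=1)
    return (probs * probs.clamp_min(1e-12).log()).sum(dim=1)  # equals minus the entropy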
arXiv Detail & Related papers (2020-06-07T00:34:57Z) - Communication-Efficient Distributed Stochastic AUC Maximization with
Deep Neural Networks [50.42141893913188]
We study distributed stochastic AUC maximization at large scale when the predictive model is a deep neural network.
Our method requires far fewer communication rounds while retaining theoretical guarantees.
Experiments on several datasets demonstrate the effectiveness of our approach and corroborate the theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z) - Uncertainty Estimation Using a Single Deep Deterministic Neural Network [66.26231423824089]
We propose a method for training a deterministic deep model that can find and reject out of distribution data points at test time with a single forward pass.
We scale training of such models with a novel loss function and centroid updating scheme, and match the accuracy of softmax models.
arXiv Detail & Related papers (2020-03-04T12:27:36Z) - MaxUp: A Simple Way to Improve Generalization of Neural Network Training [41.89570630848936]
MaxUp is an embarrassingly simple, highly effective technique for improving the generalization performance of machine learning models.
In particular, we improve ImageNet classification from the state-of-the-art top-1 accuracy of 85.5% (without extra data) to 85.8%.
arXiv Detail & Related papers (2020-02-20T21:20:28Z)
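MaxUp's core recipe is a min-max objective: draw several augmented copies of each training example and back-propagate only through the worst (highest-loss) copy. The sketch below assumes a generic stochastic `augment` function and a standard PyTorch classifier, both placeholders; it illustrates the idea rather than reproducing the authors' code.

import torch
import torch.nn.functional as F

def maxup_loss(model, images, labels, augment, m: int = 4):
    """MaxUp-style objective: for each example, draw m augmented copies and
    minimize the maximum per-example cross-entropy over those copies."""
    per_copy_losses = []
    for _ in range(m):
        logits = model(augment(images))  # `augment` = any stochastic augmentation
        per_copy_losses.append(F.cross_entropy(logits, labels, reduction="none"))
    worst_case = torch.stack(per_copy_losses, dim=0).max(dim=0).values
    return worst_case.mean()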
This list is automatically generated from the titles and abstracts of the papers on this site.