Controlled abstention neural networks for identifying skillful
predictions for classification problems
- URL: http://arxiv.org/abs/2104.08281v1
- Date: Fri, 16 Apr 2021 17:18:32 GMT
- Authors: Elizabeth A. Barnes and Randal J. Barnes
- Abstract summary: We introduce a novel loss function, termed the "NotWrong loss", that allows neural networks to identify forecasts of opportunity for classification problems.
The NotWrong loss is applied during training to preferentially learn from the more confident samples.
We show that the NotWrong loss outperforms other existing loss functions for multiple climate use cases.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The earth system is exceedingly complex and often chaotic in nature, making
prediction incredibly challenging: we cannot expect to make perfect predictions
all of the time. Instead, we look for specific states of the system that lead
to more predictable behavior than others, often termed "forecasts of
opportunity." When these opportunities are not present, scientists need
prediction systems that are capable of saying "I don't know." We introduce a
novel loss function, termed the "NotWrong loss", that allows neural networks to
identify forecasts of opportunity for classification problems. The NotWrong
loss introduces an abstention class that allows the network to identify the
more confident samples and abstain (say "I don't know") on the less confident
samples. The abstention loss is designed to abstain on a user-defined fraction
of the samples via a PID controller. Unlike many machine learning methods used
to reject samples post-training, the NotWrong loss is applied during training
to preferentially learn from the more confident samples. We show that the
NotWrong loss outperforms other existing loss functions for multiple climate
use cases. The implementation of the proposed loss function is straightforward
in most network architectures designed for classification as it only requires
the addition of an abstention class to the output layer and modification of the
loss function.
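The mechanism described in the abstract can be sketched in a few lines. The following is a minimal NumPy illustration, not the paper's implementation: the exact functional form of the loss and the PID gains used here are assumptions. Only the ideas come from the abstract — an extra abstention class on the output layer, a loss that rewards being "not wrong" (either correct or abstaining), and a PID controller that steers the observed abstention fraction toward a user-defined setpoint by adjusting the abstention penalty.

```python
import numpy as np

def notwrong_style_loss(logits, labels, abstain_penalty):
    """Sketch of a NotWrong-style loss (illustrative form, not the paper's).

    The last column of `logits` is the abstention class. The loss rewards
    probability mass on either the correct class or the abstention class
    ("not wrong"), while `abstain_penalty` discourages abstaining too
    freely; the PID controller below tunes that penalty during training.
    """
    z = logits - logits.max(axis=1, keepdims=True)        # numerical stability
    p = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)  # softmax
    p_correct = p[np.arange(len(labels)), labels]
    p_abstain = p[:, -1]
    nll_not_wrong = -np.log(p_correct + p_abstain + 1e-12)
    return float(np.mean(nll_not_wrong + abstain_penalty * p_abstain))

class AbstentionPID:
    """PID controller that nudges the abstention penalty so the observed
    abstention fraction tracks a user-defined setpoint (gains are assumed)."""
    def __init__(self, setpoint, kp=5.0, ki=0.5, kd=0.0):
        self.setpoint = setpoint  # desired fraction of abstained samples
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def update(self, observed_fraction):
        error = observed_fraction - self.setpoint
        self.integral += error
        deriv = 0.0 if self.prev_error is None else error - self.prev_error
        self.prev_error = error
        # abstaining more than requested -> raise the penalty, and vice versa
        return self.kp * error + self.ki * self.integral + self.kd * deriv
```

In use, one would measure after each epoch the fraction of samples whose argmax falls on the abstention class, pass it to `update`, and apply the returned adjustment (floored at zero) as the next epoch's penalty, so the network settles at roughly the requested abstention rate.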
Related papers
- Extracting Usable Predictions from Quantized Networks through Uncertainty Quantification for OOD Detection
OOD detection has become more pertinent with advances in network design and increased task complexity.
We introduce an Uncertainty Quantification (UQ) technique to quantify the uncertainty in the predictions from a pre-trained vision model.
We observe that our technique saves up to 80% of ignored samples from being misclassified.
arXiv Detail & Related papers (2024-03-02T03:03:29Z)
- On the Dynamics Under the Unhinged Loss and Beyond
We introduce the unhinged loss, a concise loss function, that offers more mathematical opportunities to analyze closed-form dynamics.
The unhinged loss allows for considering more practical techniques, such as time-varying learning rates and feature normalization.
arXiv Detail & Related papers (2023-12-13T02:11:07Z)
- ZigZag: Universal Sampling-free Uncertainty Estimation Through Two-Step Inference
We introduce a sampling-free approach that is generic and easy to deploy.
We produce reliable uncertainty estimates on par with state-of-the-art methods at a significantly lower computational cost.
arXiv Detail & Related papers (2022-11-21T13:23:09Z)
- A heteroencoder architecture for prediction of failure locations in porous metals using variational inference
We employ an encoder-decoder convolutional neural network to predict the failure locations of porous metal tension specimens.
The objective of predicting failure locations presents an extreme case of class imbalance, since most of the material in the specimens does not fail.
We demonstrate that the resulting predicted variances are effective in ranking the locations that are most likely to fail in any given specimen.
arXiv Detail & Related papers (2022-01-31T20:26:53Z)
- Mixing between the Cross Entropy and the Expectation Loss Terms
Cross entropy loss tends to focus on hard-to-classify samples during training.
We show that adding to the optimization goal the expectation loss helps the network to achieve better accuracy.
Our experiments show that the new training protocol improves performance across a diverse set of classification domains.
arXiv Detail & Related papers (2021-09-12T23:14:06Z)
- Omnipredictors
Loss minimization is a dominant paradigm in machine learning.
We introduce the notion of an $(\mathcal{L}, \mathcal{C})$-omnipredictor, which could be used to optimize any loss in a family.
We show that such "loss-oblivious" learning is feasible through a connection to multicalibration.
arXiv Detail & Related papers (2021-09-11T23:28:49Z)
- Center Prediction Loss for Re-identification
We propose a new loss based on center prediction: a sample must be positioned in the feature space such that the location of the center of its same-class samples can be roughly predicted from it.
We show that this new loss leads to a more flexible intra-class distribution constraint while ensuring that samples from different classes remain well separated.
arXiv Detail & Related papers (2021-04-30T03:57:31Z)
- Controlled abstention neural networks for identifying skillful predictions for regression problems
We introduce a novel loss function, termed "abstention loss", that allows neural networks to identify forecasts of opportunity for regression problems.
The abstention loss is applied during training to preferentially learn from the more confident samples.
arXiv Detail & Related papers (2021-04-16T17:16:32Z)
- An Uncertainty-based Human-in-the-loop System for Industrial Tool Wear Analysis
We show that uncertainty measures based on Monte-Carlo dropout in the context of a human-in-the-loop system increase the system's transparency and performance.
A simulation study demonstrates that the uncertainty-based human-in-the-loop system increases performance for different levels of human involvement.
arXiv Detail & Related papers (2020-07-14T15:47:37Z)
- Regularizing Class-wise Predictions via Self-knowledge Distillation
We propose a new regularization method that penalizes the predictive distribution between similar samples.
This results in regularizing the dark knowledge (i.e., the knowledge on wrong predictions) of a single network.
Our experimental results on various image classification tasks demonstrate that this simple yet powerful method can significantly improve generalization ability.
arXiv Detail & Related papers (2020-03-31T06:03:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences arising from its use.