Related papers: Preventing Arbitrarily High Confidence on Far-Away Data in Point-Estimated Discriminative Neural Networks

URL: http://arxiv.org/abs/2311.03683v2
Date: Wed, 27 Mar 2024 16:44:22 GMT
Title: Preventing Arbitrarily High Confidence on Far-Away Data in Point-Estimated Discriminative Neural Networks
Authors: Ahmad Rashid, Serena Hacker, Guojun Zhang, Agustinus Kristiadi, Pascal Poupart
Abstract summary: ReLU networks have been shown to almost always yield high confidence predictions when the test data are far away from the training set. We overcome this problem by adding a term to the output of the neural network that corresponds to the logit of an extra class. This technique provably prevents arbitrarily high confidence on far-away test data while maintaining a simple discriminative point-estimate training.
Score: 28.97655735976179
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Discriminatively trained, deterministic neural networks are the de facto choice for classification problems. However, even though they achieve state-of-the-art results on in-domain test sets, they tend to be overconfident on out-of-distribution (OOD) data. For instance, ReLU networks - a popular class of neural network architectures - have been shown to almost always yield high confidence predictions when the test data are far away from the training set, even when they are trained with OOD data. We overcome this problem by adding a term to the output of the neural network that corresponds to the logit of an extra class, which we design to dominate the logits of the original classes as we move away from the training data. This technique provably prevents arbitrarily high confidence on far-away test data while maintaining a simple discriminative point-estimate training. Evaluation on various benchmarks demonstrates strong performance against competitive baselines on both far-away and realistic OOD data.
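The extra-class idea in the abstract can be illustrated with a toy example. The sketch below is not the authors' construction: the abstract does not specify the form of the extra logit, so the squared Euclidean distance to the training mean used here is an assumption chosen only because it grows quadratically, whereas the logits of a ReLU network grow at most linearly with the input norm. The network weights, the data, and all function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical training inputs clustered near the origin.
X_train = rng.normal(size=(200, 2))
mu = X_train.mean(axis=0)

# A small, untrained ReLU network with K = 3 classes (illustrative weights only).
K = 3
W1, b1 = rng.normal(size=(2, 16)), rng.normal(size=16)
W2, b2 = rng.normal(size=(16, K)), rng.normal(size=K)


def class_logits(x):
    """Logits of the original K classes; piecewise linear in x."""
    x = np.atleast_2d(x)
    hidden = np.maximum(x @ W1 + b1, 0.0)
    return hidden @ W2 + b2


def extra_logit(x, scale=1.0):
    """Assumed extra-class logit: squared distance to the training mean."""
    x = np.atleast_2d(x)
    return scale * np.sum((x - mu) ** 2, axis=-1)


def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def max_original_confidence(x, with_extra):
    """Largest probability assigned to any of the original K classes."""
    z = class_logits(x)
    if with_extra:
        z = np.concatenate([z, extra_logit(x)[:, None]], axis=-1)
    return softmax(z)[:, :K].max(axis=-1)


far_away = np.array([[1000.0, 0.0]])
plain = max_original_confidence(far_away, with_extra=False)
guarded = max_original_confidence(far_away, with_extra=True)
print("without extra class:", plain, "with extra class:", guarded)
```

Without the extra class, the original classes must share all of the probability mass, so the largest of them is at least 1/K no matter how far the input is from the training data. With the quadratic extra logit, that mass moves to the extra class once the input is far enough away.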

Related papers

Refining Tuberculosis Detection in CXR Imaging: Addressing Bias in Deep Neural Networks via Interpretability [1.9936075659851882]
We argue that the reliability of deep learning models is limited, even if they can be shown to obtain perfect classification accuracy on the test data. We show that pre-training a deep neural network on a large-scale proxy task, as well as using mixed objective optimization network (MOON), can improve the alignment of decision foundations between models and experts.
arXiv Detail & Related papers (2024-07-19T06:41:31Z)
DCLP: Neural Architecture Predictor with Curriculum Contrastive Learning [5.2319020651074215]
We propose a Curriculum-guided Contrastive Learning framework for neural Predictor (DCLP). Our method simplifies the contrastive task by designing a novel curriculum to enhance the stability of unlabeled training data distribution. We experimentally demonstrate that DCLP has high accuracy and efficiency compared with existing predictors.
arXiv Detail & Related papers (2023-02-25T08:16:21Z)
Energy-based Out-of-Distribution Detection for Graph Neural Networks [76.0242218180483]
We propose a simple, powerful and efficient OOD detection model for GNN-based learning on graphs, which we call GNNSafe. GNNSafe achieves up to 17.0% AUROC improvement over the state of the art, and it could serve as a simple yet strong baseline in such an under-developed area.
arXiv Detail & Related papers (2023-02-06T16:38:43Z)
Adversarial training with informed data selection [53.19381941131439]
Adversarial training is the most efficient solution to defend the network against these malicious attacks. This work proposes a data selection strategy to be applied during mini-batch training. The simulation results show that a good compromise can be obtained regarding robustness and standard accuracy.
arXiv Detail & Related papers (2023-01-07T12:09:50Z)
Being a Bit Frequentist Improves Bayesian Neural Networks [76.73339435080446]
We show that OOD-trained BNNs are competitive with, if not better than, recent frequentist baselines. This work provides strong baselines for future work in both Bayesian and frequentist UQ.
arXiv Detail & Related papers (2021-06-18T11:22:42Z)
Improving Uncertainty Calibration via Prior Augmented Data [56.88185136509654]
Neural networks have proven successful at learning from complex data distributions by acting as universal function approximators. They are often overconfident in their predictions, which leads to inaccurate and miscalibrated probabilistic predictions. We propose a solution by seeking out regions of feature space where the model is unjustifiably overconfident, and conditionally raising the entropy of those predictions towards that of the prior distribution of the labels.
arXiv Detail & Related papers (2021-02-22T07:02:37Z)
Probing Predictions on OOD Images via Nearest Categories [97.055916832257]
We study out-of-distribution (OOD) prediction behavior of neural networks when they classify images from unseen classes or corrupted images. We introduce a new measure, nearest category generalization (NCG), where we compute the fraction of OOD inputs that are classified with the same label as their nearest neighbor in the training set. We find that robust networks have consistently higher NCG accuracy than natural training, even when the OOD data is much farther away than the robustness radius.
arXiv Detail & Related papers (2020-11-17T07:42:27Z)
Grasping Detection Network with Uncertainty Estimation for Confidence-Driven Semi-Supervised Domain Adaptation [17.16216430459064]
This paper presents an approach enabling easy domain adaptation through a novel grasping detection network with confidence-driven semi-supervised learning. The proposed grasping detection network provides a prediction uncertainty estimation mechanism by leveraging a Feature Pyramid Network (FPN), and the mean-teacher semi-supervised learning uses this uncertainty information to emphasize the consistency loss only for those unlabelled data with high confidence. Our results show that the proposed network can achieve a high success rate on the Cornell grasping dataset, and for domain adaptation with very limited data, the confidence-driven mean teacher outperforms the original mean teacher and direct training by more than 10% in evaluation.
arXiv Detail & Related papers (2020-08-20T07:42:45Z)
Confidence-Aware Learning for Deep Neural Networks [4.9812879456945]
We propose a method of training deep neural networks with a novel loss function, named Correctness Ranking Loss. It regularizes class probabilities explicitly to be better confidence estimates in terms of ordinal ranking according to confidence. It has almost the same computational costs for training as conventional deep classifiers and outputs reliable predictions by a single inference.
arXiv Detail & Related papers (2020-07-03T02:00:35Z)
Adversarial Self-Supervised Contrastive Learning [62.17538130778111]
Existing adversarial learning approaches mostly use class labels to generate adversarial samples that lead to incorrect predictions. We propose a novel adversarial attack for unlabeled data, which makes the model confuse the instance-level identities of the perturbed data samples. We present a self-supervised contrastive learning framework to adversarially train a robust neural network without labeled data.
arXiv Detail & Related papers (2020-06-13T08:24:33Z)

This list is automatically generated from the titles and abstracts of the papers on this site.