Why Out-of-distribution Detection in CNNs Does Not Like Mahalanobis --
and What to Use Instead
- URL: http://arxiv.org/abs/2110.07043v1
- Date: Wed, 13 Oct 2021 21:41:37 GMT
- Title: Why Out-of-distribution Detection in CNNs Does Not Like Mahalanobis --
and What to Use Instead
- Authors: Kamil Szyc, Tomasz Walkowiak, Henryk Maciejewski
- Abstract summary: We show that in many OoD studies on high-dimensional data, LOF-based (Local Outlier Factor) methods outperform the parametric, Mahalanobis distance-based methods.
This motivates us to propose a nonparametric, LOF-based method of generating confidence scores for CNNs.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Convolutional neural networks applied for real-world classification tasks
need to recognize inputs that are far or out-of-distribution (OoD) with respect
to the known or training data. To achieve this, many methods estimate
class-conditional posterior probabilities and use confidence scores obtained
from the posterior distributions. Recent works propose to use multivariate
Gaussian distributions as models of posterior distributions at different layers
of the CNN (i.e., for low- and upper-level features), which leads to
confidence scores based on the Mahalanobis distance. However, this procedure
involves estimating probability density in high-dimensional data from an
insufficient number of observations (e.g., the feature dimensionalities at the
last two layers of the ResNet-101 model are 2048 and 1024, with ca. 1000
observations per class used to estimate the density). In this work, we address
this problem. We show that in many OoD studies on high-dimensional data,
LOF-based (Local Outlier Factor) methods outperform the parametric,
Mahalanobis distance-based methods. This motivates us to propose a
nonparametric, LOF-based method of generating confidence scores for CNNs.
We performed several feasibility studies involving ResNet-101 and
EfficientNet-B3, based on CIFAR-10 and ImageNet (as known data), and CIFAR-100,
SVHN, ImageNet2010, Places365, or ImageNet-O (as outliers). We demonstrated
that nonparametric LOF-based confidence estimation can improve on the current
Mahalanobis-based SOTA or achieve similar performance in a simpler way.
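To make the contrast concrete, the sketch below shows both scoring schemes applied to penultimate-layer feature vectors (such as the 2048-dimensional ResNet-101 features mentioned above). This is a minimal illustration, not the authors' released implementation: the tied-covariance Mahalanobis score follows the common Lee et al. (2018) recipe, and `n_neighbors=20` is an assumed default.

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

def mahalanobis_confidence(train_feats, train_labels, test_feats):
    """Parametric baseline: per-class Gaussian means with a single tied
    covariance; confidence = -(minimum squared Mahalanobis distance)."""
    classes = np.unique(train_labels)
    means = np.stack([train_feats[train_labels == c].mean(axis=0)
                      for c in classes])
    centered = train_feats - means[np.searchsorted(classes, train_labels)]
    cov = centered.T @ centered / len(train_feats)
    prec = np.linalg.pinv(cov)  # pinv: cov is rank-deficient when D > N
    d2 = np.stack([np.einsum("nd,de,ne->n", test_feats - m, prec, test_feats - m)
                   for m in means])
    return -d2.min(axis=0)  # higher = more in-distribution

def lof_confidence(train_feats, test_feats, n_neighbors=20):
    """Nonparametric alternative: compare local neighborhood densities
    instead of fitting a high-dimensional Gaussian."""
    lof = LocalOutlierFactor(n_neighbors=n_neighbors, novelty=True)
    lof.fit(train_feats)
    return lof.score_samples(test_feats)  # opposite of LOF; higher = inlier
```

In either case, an OoD decision is then a simple threshold on the score, calibrated on held-out known data.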
Related papers
- Credal Wrapper of Model Averaging for Uncertainty Estimation on Out-Of-Distribution Detection [5.19656787424626]
This paper presents an innovative approach, called the credal wrapper, for formulating a credal set representation of model averaging for Bayesian neural networks (BNNs) and deep ensembles.
Given a finite collection of single distributions derived from BNNs or deep ensembles, the proposed approach extracts an upper and a lower probability bound per class.
Compared to BNN and deep ensemble baselines, the proposed credal representation methodology exhibits superior performance in uncertainty estimation.
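As a rough illustration of the idea (an assumption about the mechanics, since the summary only states that per-class upper and lower bounds are extracted), the bounds can be read off as per-class minima and maxima across the member distributions:

```python
import numpy as np

def credal_bounds(member_probs):
    """member_probs: (M, C) array of M softmax outputs (BNN samples or
    ensemble members) over C classes. Returns per-class lower/upper bounds."""
    return member_probs.min(axis=0), member_probs.max(axis=0)

def interval_uncertainty(member_probs):
    """Wide lower/upper intervals indicate high epistemic uncertainty."""
    lower, upper = credal_bounds(member_probs)
    return float((upper - lower).sum())
```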
arXiv Detail & Related papers (2024-05-23T20:51:22Z)
- Fast, Distribution-free Predictive Inference for Neural Networks with Coverage Guarantees [25.798057062452443]
This paper introduces a novel, computationally efficient algorithm for predictive inference (PI).
It requires no distributional assumptions on the data and can be computed faster than existing bootstrap-type methods for neural networks.
arXiv Detail & Related papers (2023-06-11T04:03:58Z)
- Evidence Networks: simple losses for fast, amortized, neural Bayesian model comparison [0.0]
Evidence Networks can enable Bayesian model comparison when state-of-the-art methods fail.
We introduce the leaky parity-odd power transform, leading to the novel "l-POP-Exponential" loss function.
We show that Evidence Networks are explicitly independent of dimensionality of the parameter space and scale mildly with the complexity of the posterior probability density function.
arXiv Detail & Related papers (2023-05-18T18:14:53Z)
- NUQ: Nonparametric Uncertainty Quantification for Deterministic Neural Networks [151.03112356092575]
We show a principled way to measure the uncertainty of predictions for a classifier, based on Nadaraya-Watson's nonparametric estimate of the conditional label distribution.
We demonstrate the strong performance of the method in uncertainty estimation tasks on a variety of real-world image datasets.
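For reference, a minimal sketch of the Nadaraya-Watson estimator the summary refers to; the Gaussian kernel and bandwidth `h` are illustrative choices, not necessarily the paper's:

```python
import numpy as np

def nadaraya_watson_label_probs(x, train_x, train_y, num_classes, h=1.0):
    """Nadaraya-Watson estimate of p(y|x): a kernel-weighted average of
    one-hot training labels (Gaussian kernel, bandwidth h)."""
    d2 = ((train_x - x) ** 2).sum(axis=1)    # squared distances to x
    w = np.exp(-d2 / (2.0 * h ** 2))         # kernel weights
    onehot = np.eye(num_classes)[train_y]    # (N, C) one-hot labels
    return w @ onehot / max(w.sum(), 1e-12)  # normalized label mixture
```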
arXiv Detail & Related papers (2022-02-07T12:30:45Z)
- PDC-Net+: Enhanced Probabilistic Dense Correspondence Network [161.76275845530964]
We propose the Enhanced Probabilistic Dense Correspondence Network, PDC-Net+, capable of estimating accurate dense correspondences.
We develop an architecture and an enhanced training strategy tailored for robust and generalizable uncertainty prediction.
Our approach obtains state-of-the-art results on multiple challenging geometric matching and optical flow datasets.
arXiv Detail & Related papers (2021-09-28T17:56:41Z)
- Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z)
- Bayesian Neural Network via Stochastic Gradient Descent [0.0]
We show how gradient estimation techniques can be applied to Bayesian neural networks.
Our work considerably beats the previous state-of-the-art approaches for regression using Bayesian neural networks.
arXiv Detail & Related papers (2020-06-04T18:33:59Z)
- Diversity inducing Information Bottleneck in Model Ensembles [73.80615604822435]
In this paper, we target the problem of generating effective ensembles of neural networks by encouraging diversity in prediction.
We explicitly optimize a diversity inducing adversarial loss for learning latent variables and thereby obtain diversity in the output predictions necessary for modeling multi-modal data.
Compared to the most competitive baselines, we show significant improvements in classification accuracy under a shift in the data distribution.
arXiv Detail & Related papers (2020-03-10T03:10:41Z)
- Uncertainty Estimation Using a Single Deep Deterministic Neural Network [66.26231423824089]
We propose a method for training a deterministic deep model that can find and reject out-of-distribution data points at test time with a single forward pass.
We scale training with a novel loss function and centroid updating scheme, matching the accuracy of softmax models.
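As a sketch of the kind of centroid updating scheme referred to here (an assumption about the mechanics: the paper's exact formulation tracks exponential moving averages of per-class feature statistics, and `gamma` is an assumed decay):

```python
import numpy as np

def ema_centroid_update(centroids, feats, labels, num_classes, gamma=0.999):
    """Drift each class centroid toward the mean feature of that class in
    the current batch via an exponential moving average."""
    for c in range(num_classes):
        batch = feats[labels == c]
        if len(batch):
            centroids[c] = gamma * centroids[c] + (1 - gamma) * batch.mean(axis=0)
    return centroids
```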
arXiv Detail & Related papers (2020-03-04T12:27:36Z)
- Meta-Learned Confidence for Few-shot Learning [60.6086305523402]
A popular transductive inference technique for few-shot metric-based approaches is to update the prototype of each class with the mean of the most confident query examples.
We propose to meta-learn the confidence for each query sample, to assign optimal weights to unlabeled queries.
We validate our few-shot learning model with meta-learned confidence on four benchmark datasets.
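One plausible reading of that update rule, sketched below with hypothetical names (`conf[n, c]` stands in for the meta-learned confidence of query n for class c):

```python
import numpy as np

def refine_prototypes(support_feats, support_labels, query_feats, conf,
                      num_classes):
    """Transductive refinement: each prototype becomes the weighted mean of
    its support features and all query features, weighted by confidence."""
    protos = []
    for c in range(num_classes):
        sup = support_feats[support_labels == c]
        w = conf[:, c:c + 1]  # (N, 1) per-query weights for class c
        num = sup.sum(axis=0) + (w * query_feats).sum(axis=0)
        den = len(sup) + w.sum()
        protos.append(num / den)
    return np.stack(protos)
```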
arXiv Detail & Related papers (2020-02-27T10:22:17Z)