Provably Safeguarding a Classifier from OOD and Adversarial Samples: an Extreme Value Theory Approach
- URL: http://arxiv.org/abs/2501.10202v1
- Date: Fri, 17 Jan 2025 13:51:14 GMT
- Title: Provably Safeguarding a Classifier from OOD and Adversarial Samples: an Extreme Value Theory Approach
- Authors: Nicolas Atienza, Christophe Labreuche, Johanne Cohen, Michele Sebag
- Abstract summary: This paper introduces a novel method, Sample-efficient Probabilistic Detection using Extreme Value Theory (SPADE).
The approach is based on a Generalized Extreme Value (GEV) model of the training distribution in the classifier's latent space.
The abstaining classifier, which rejects samples based on their assessment by the GEV model, provably avoids OOD and adversarial samples.
- Score: 2.5674049243330255
- Abstract: This paper introduces a novel method, Sample-efficient Probabilistic Detection using Extreme Value Theory (SPADE), which transforms a classifier into an abstaining classifier, offering provable protection against out-of-distribution and adversarial samples. The approach is based on a Generalized Extreme Value (GEV) model of the training distribution in the classifier's latent space, enabling the formal characterization of OOD samples. Interestingly, under mild assumptions, the GEV model also allows for formally characterizing adversarial samples. The abstaining classifier, which rejects samples based on their assessment by the GEV model, provably avoids OOD and adversarial samples. The empirical validation of the approach, conducted on various neural architectures (ResNet, VGG, and Vision Transformer) and medium and large-sized datasets (CIFAR-10, CIFAR-100, and ImageNet), demonstrates its frugality, stability, and efficiency compared to the state of the art.
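The abstract states the mechanism but no algorithmic detail survives the summary. Below is a minimal sketch of the general recipe, not the authors' implementation: it assumes the latent score is the distance to the predicted class's centroid, fits SciPy's `genextreme` to block maxima of training scores, and abstains beyond a high GEV quantile. All names (`fit_gev_threshold`, `abstaining_predict`, `embed`, `classify`, `centroids`) are hypothetical.

```python
# A minimal sketch of a GEV-based abstaining classifier (assumptions above).
import numpy as np
from scipy.stats import genextreme

def fit_gev_threshold(train_scores, block_size=50, alpha=0.05):
    """Fit a GEV distribution to block maxima of in-distribution latent
    scores and return the (1 - alpha) quantile as a rejection threshold."""
    n_blocks = len(train_scores) // block_size
    blocks = np.asarray(train_scores[: n_blocks * block_size]).reshape(n_blocks, block_size)
    shape, loc, scale = genextreme.fit(blocks.max(axis=1))
    return genextreme.ppf(1.0 - alpha, shape, loc=loc, scale=scale)

def abstaining_predict(classify, embed, centroids, x, threshold):
    """Predict normally, but abstain (return None) when the sample's latent
    score exceeds the GEV threshold, flagging it as OOD/adversarial."""
    z = embed(x)                                  # latent representation
    pred = int(np.argmax(classify(x)))            # base classifier's decision
    score = np.linalg.norm(z - centroids[pred])   # distance to class centroid
    return None if score > threshold else pred
```

Under the block-maxima assumption, the GEV quantile bounds the probability that an in-distribution sample is rejected, which is the sense in which such an abstention rule can come with a formal guarantee.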
Related papers
- Credal Wrapper of Model Averaging for Uncertainty Estimation on Out-Of-Distribution Detection [5.19656787424626]
This paper presents an innovative approach, called the credal wrapper, for formulating a credal-set representation of model averaging for Bayesian neural networks (BNNs) and deep ensembles.
Given a finite collection of single distributions derived from BNNs or deep ensembles, the proposed approach extracts an upper and a lower probability bound per class.
Compared to BNN and deep ensemble baselines, the proposed credal representation methodology exhibits superior performance in uncertainty estimation.
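As a rough illustration of the idea, assuming the simplest credal construction (elementwise min/max over member distributions, which may differ from the paper's exact formulation):

```python
# Illustrative credal bounds from ensemble predictions (simplest construction).
import numpy as np

def credal_bounds(member_probs):
    """member_probs: (n_members, n_classes) array, one predictive
    distribution per ensemble member or BNN posterior sample.
    Returns per-class lower and upper probability bounds."""
    probs = np.asarray(member_probs)
    return probs.min(axis=0), probs.max(axis=0)

def interval_width_uncertainty(member_probs):
    """Total width of the class-probability intervals: wider intervals mean
    more member disagreement, hence higher estimated uncertainty."""
    lower, upper = credal_bounds(member_probs)
    return float((upper - lower).sum())
```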
arXiv Detail & Related papers (2024-05-23T20:51:22Z) - Variational Classification [51.2541371924591]
Treating inputs to the softmax layer as samples of a latent variable, we adopt an abstracted perspective that reveals a potential inconsistency in the standard softmax layer.
We derive a variational objective to train the model, analogous to the evidence lower bound (ELBO) used to train variational auto-encoders.
Instead of relying on the implicit assumption found in a standard softmax layer, we induce a chosen latent distribution.
arXiv Detail & Related papers (2023-05-17T17:47:19Z) - GREAT Score: Global Robustness Evaluation of Adversarial Perturbation using Generative Models [60.48306899271866]
We present a new framework, called GREAT Score, for global robustness evaluation of adversarial perturbation using generative models.
GREAT Score correlates highly with attack-based model rankings on RobustBench while being significantly cheaper to compute.
GREAT Score can be used for remote auditing of privacy-sensitive black-box models.
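A hedged sketch of the sample-and-score structure follows, assuming hypothetical `generate` and `classifier` interfaces; the real GREAT Score derives a certified robustness estimate rather than this simple top-2 margin proxy.

```python
# Sketch of a generative global-robustness estimate (margin proxy only).
import numpy as np

def great_style_score(generate, classifier, n_samples=1000):
    """generate() draws one synthetic input from a generative model;
    classifier(x) returns a vector of class probabilities. The top-2
    probability margin is used as a cheap, attack-free robustness proxy."""
    margins = []
    for _ in range(n_samples):
        probs = np.sort(classifier(generate()))
        margins.append(max(float(probs[-1] - probs[-2]), 0.0))
    return float(np.mean(margins))
```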
arXiv Detail & Related papers (2023-04-19T14:58:27Z) - Boosting Differentiable Causal Discovery via Adaptive Sample Reweighting [62.23057729112182]
Differentiable score-based causal discovery methods learn a directed acyclic graph from observational data.
We propose a model-agnostic framework, ReScore, which boosts causal discovery performance by dynamically learning adaptive sample weights for the reweighted score function.
arXiv Detail & Related papers (2023-03-06T14:49:59Z) - fAux: Testing Individual Fairness via Gradient Alignment [2.5329739965085785]
We describe a new approach for testing individual fairness that avoids the requirements of prior tests.
We show that the proposed method effectively identifies discrimination on both synthetic and real-world datasets.
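The title points to gradient alignment; a heavily simplified sketch of such a test statistic follows, assuming precomputed input gradients from the target model and from an auxiliary model predicting the protected attribute (both hypothetical here).

```python
# Sketch of a gradient-alignment statistic for individual-fairness testing.
import numpy as np

def gradient_alignment(grad_target, grad_aux, eps=1e-12):
    """Cosine similarity between the target model's input gradient and the
    gradient of an auxiliary model trained to predict the protected
    attribute; strong alignment suggests the target model relies on
    features that encode the protected attribute."""
    gt, ga = np.ravel(grad_target), np.ravel(grad_aux)
    return float(gt @ ga / (np.linalg.norm(gt) * np.linalg.norm(ga) + eps))
```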
arXiv Detail & Related papers (2022-10-10T21:27:20Z) - Towards Robust Visual Question Answering: Making the Most of Biased Samples via Contrastive Learning [54.61762276179205]
We propose a novel contrastive learning approach, MMBS, for building robust VQA models by Making the Most of Biased Samples.
Specifically, we construct positive samples for contrastive learning by eliminating the information related to spurious correlation from the original training samples.
We validate our contributions by achieving competitive performance on the OOD dataset VQA-CP v2 while preserving robust performance on the ID dataset VQA v2.
arXiv Detail & Related papers (2022-10-10T11:05:21Z) - Understanding, Detecting, and Separating Out-of-Distribution Samples and Adversarial Samples in Text Classification [80.81532239566992]
We compare the two types of anomalies (OOD and Adv samples) with in-distribution (ID) samples along three aspects.
We find that OOD samples expose their aberration starting from the first layer, while the abnormalities of Adv samples do not emerge until the deeper layers of the model.
We propose a simple method to separate ID, OOD, and Adv samples using the hidden representations and output probabilities of the model.
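The layer-wise finding suggests a simple triage rule; the following sketch (with hypothetical per-layer means and thresholds, not the paper's exact detector) illustrates it.

```python
# Sketch of a layer-wise triage rule based on the paper's observation.
import numpy as np

def layer_distances(hidden_states, id_layer_means):
    """hidden_states: per-layer feature vectors for one sample;
    id_layer_means: per-layer mean features estimated on ID data."""
    return [float(np.linalg.norm(h - m))
            for h, m in zip(hidden_states, id_layer_means)]

def triage(hidden_states, id_layer_means, ood_thresh, adv_thresh):
    scores = layer_distances(hidden_states, id_layer_means)
    if scores[0] > ood_thresh:    # OOD aberration shows up at the first layer
        return "OOD"
    if scores[-1] > adv_thresh:   # Adv abnormality emerges only in deep layers
        return "Adv"
    return "ID"
```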
arXiv Detail & Related papers (2022-04-09T12:11:59Z) - UQGAN: A Unified Model for Uncertainty Quantification of Deep Classifiers trained via Conditional GANs [9.496524884855559]
We present an approach to quantifying uncertainty for deep neural networks in image classification, based on generative adversarial networks (GANs).
Instead of shielding the entire in-distribution data with GAN-generated OoD examples, we shield each class separately with out-of-class examples generated by a conditional GAN.
In particular, we improve over the OoD detection and false-positive (FP) detection performance of state-of-the-art GAN-training-based classifiers.
arXiv Detail & Related papers (2022-01-31T14:42:35Z) - WOOD: Wasserstein-based Out-of-Distribution Detection [6.163329453024915]
Training and test data for deep-neural-network-based classifiers are usually assumed to be sampled from the same distribution.
When some test samples are drawn from a distribution far from that of the training samples (OOD samples), the trained neural network tends to make high-confidence predictions on them.
We propose a Wasserstein-based out-of-distribution detection (WOOD) method to overcome these challenges.
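WOOD builds the Wasserstein distance into the training objective; as a post-hoc illustration of the distance itself, one can score a softmax output by its transport cost to the nearest one-hot class distribution (a simplification, not the paper's method).

```python
# Post-hoc illustration of a Wasserstein OOD score on the label simplex.
import numpy as np
from scipy.stats import wasserstein_distance

def wasserstein_ood_score(softmax_probs):
    """Transport cost from the softmax output to the nearest one-hot class
    distribution; a large value means the prediction fits no single class."""
    probs = np.asarray(softmax_probs, dtype=float)
    support = np.arange(len(probs))
    scores = []
    for k in range(len(probs)):
        one_hot = np.zeros_like(probs)
        one_hot[k] = 1.0
        scores.append(wasserstein_distance(support, support, probs, one_hot))
    return min(scores)
```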
arXiv Detail & Related papers (2021-12-13T02:35:15Z) - AdaPT-GMM: Powerful and robust covariate-assisted multiple testing [0.7614628596146599]
We propose a new empirical Bayes method for covariate-assisted multiple testing with false discovery rate (FDR) control.
Our method refines the adaptive p-value thresholding (AdaPT) procedure by generalizing its masking scheme.
We show in extensive simulations and real data examples that our new method, which we call AdaPT-GMM, consistently delivers high power.
arXiv Detail & Related papers (2021-06-30T05:06:18Z) - Label Smoothed Embedding Hypothesis for Out-of-Distribution Detection [72.35532598131176]
We propose an unsupervised method to detect OOD samples using a $k$-NN density estimate.
We leverage a recent insight about label smoothing, which we call the Label Smoothed Embedding Hypothesis.
We show that our proposal outperforms many OOD baselines and also provide new finite-sample high-probability statistical results.
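A minimal sketch of the k-NN density score on embeddings, one standard instantiation of this idea (the paper's contribution additionally involves label smoothing during training):

```python
# Minimal k-NN density score for OOD detection on embeddings.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def fit_knn(train_embeddings, k=10):
    """Index the in-distribution training embeddings."""
    return NearestNeighbors(n_neighbors=k).fit(train_embeddings)

def knn_ood_score(index, z):
    """Distance to the k-th nearest training embedding, a simple inverse
    density estimate: larger distance = lower density = more likely OOD."""
    dists, _ = index.kneighbors(np.atleast_2d(np.asarray(z)))
    return float(dists[0, -1])
```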
arXiv Detail & Related papers (2021-02-09T21:04:44Z)