Models of Computational Profiles to Study the Likelihood of DNN
Metamorphic Test Cases
- URL: http://arxiv.org/abs/2107.13491v1
- Date: Wed, 28 Jul 2021 16:57:44 GMT
- Title: Models of Computational Profiles to Study the Likelihood of DNN
Metamorphic Test Cases
- Authors: Ettore Merlo, Mira Marhaba, Foutse Khomh, Houssem Ben Braiek, Giuliano
Antoniol
- Abstract summary: We introduce "computational profiles" as vectors of neuron activation levels.
We show that the distributions of computational profile likelihood for training and test cases are broadly similar.
In contrast, metamorphic test cases show prediction likelihoods that span a wider range than those of training cases, test cases, and random noise.
- Score: 9.997379778870695
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural network test cases are meant to exercise different reasoning paths in
an architecture and are used to validate the prediction outcomes. In this paper, we
introduce "computational profiles" as vectors of neuron activation levels. We
investigate the distribution of computational profile likelihood of metamorphic
test cases with respect to the likelihood distributions of training, test, and
error-control cases. We estimate the non-parametric probability densities of
neuron activation levels for each distinct output class. Probabilities are
inferred using training cases only, without any additional knowledge about
metamorphic test cases. Experiments are performed by training a network on the
Fashion-MNIST image dataset and comparing prediction likelihoods with those
obtained from error-control data and from metamorphic test cases. Experimental
results show that the distributions of computational profile likelihood for
training and test cases are broadly similar, while the likelihoods of the
random-noise control data are consistently and markedly lower than those observed
for the training and testing sets. In contrast, metamorphic test cases show
prediction likelihoods that span a wider range than those of training cases, test
cases, and random noise. Moreover, the presented approach allows independent
assessment of different training classes, and the experiments show that some
classes are more prone to misclassifying metamorphic test cases than others. In
conclusion, metamorphic test cases represent very aggressive tests for neural
network architectures. Furthermore, since metamorphic test cases force a network
to misclassify inputs whose likelihood is similar to that of training cases, they
can also be viewed as adversarial attacks that evade defenses based on
computational profile likelihood evaluation.
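The abstract describes estimating non-parametric densities of neuron activation levels per output class from training cases only, and then scoring how likely a given computational profile is under those densities. The sketch below is a minimal illustration of that idea, not the authors' implementation: it assumes activation levels have already been extracted into arrays, uses scipy's gaussian_kde as the non-parametric density estimator, and combines neurons by summing per-neuron log-densities (the abstract does not state the exact estimator or combination rule, so both are assumptions).

```python
# Minimal sketch of per-class computational-profile likelihood estimation.
# Assumptions (not specified in the abstract): per-neuron Gaussian kernel
# density estimates, combined by summing per-neuron log-densities.

import numpy as np
from scipy.stats import gaussian_kde


def fit_class_densities(profiles, labels):
    """Fit one non-parametric density per (class, neuron) pair.

    profiles: (n_samples, n_neurons) array of neuron activation levels
              ("computational profiles") collected from training cases.
    labels:   (n_samples,) array of output classes for those cases.
    """
    densities = {}
    for c in np.unique(labels):
        class_profiles = profiles[labels == c]           # training cases of class c
        densities[c] = [gaussian_kde(class_profiles[:, j])
                        for j in range(class_profiles.shape[1])]
    return densities


def profile_log_likelihood(profile, class_densities):
    """Sum of per-neuron log-densities for a single activation profile."""
    eps = 1e-12                                          # guard against log(0)
    return sum(np.log(kde(a)[0] + eps)
               for kde, a in zip(class_densities, profile))


# Usage with synthetic activations standing in for a real trained network.
rng = np.random.default_rng(0)
train_profiles = rng.normal(size=(500, 32))              # 500 cases, 32 neurons
train_labels = rng.integers(0, 10, size=500)             # 10 output classes
densities = fit_class_densities(train_profiles, train_labels)

probe_profile = rng.normal(size=32)                      # e.g. a metamorphic case
noise_profile = rng.uniform(-4.0, 4.0, size=32)          # random-noise control
print(profile_log_likelihood(probe_profile, densities[0]))
print(profile_log_likelihood(noise_profile, densities[0]))
```

In the paper's setup, the probe profiles would instead be read off a trained network when it is fed original and metamorphically transformed Fashion-MNIST images, and the resulting likelihood distributions would be compared against those of training cases, held-out test cases, and random-noise control data.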
Related papers
- Detecting Model Misspecification in Amortized Bayesian Inference with Neural Networks: An Extended Investigation [9.950524371154394]
We propose a new misspecification measure that can be trained in an unsupervised fashion and reliably detects model misspecification at test time.
We show how the proposed misspecification test warns users about suspicious outputs, raises an alarm when predictions are not trustworthy, and guides model designers in their search for better simulators.
arXiv Detail & Related papers (2024-06-05T11:30:16Z)
- Robust Nonparametric Hypothesis Testing to Understand Variability in Training Neural Networks [5.8490454659691355]
We propose a new measure of closeness between classification models based on the output of the network before thresholding.
Our measure is based on a robust hypothesis-testing framework and can be adapted to other quantities derived from trained models.
arXiv Detail & Related papers (2023-10-01T01:44:35Z)
- On the Variance of Neural Network Training with respect to Test Sets and Distributions [1.994307489466967]
We show that standard CIFAR-10 and ImageNet trainings have little variance in performance on the underlying test-distributions.
We prove that the variance of neural network trainings on their test-sets is a downstream consequence of the class-calibration property discovered by Jiang et al.
Our analysis yields a simple formula which accurately predicts variance for the classification case.
arXiv Detail & Related papers (2023-04-04T16:09:55Z)
- A Learning Based Hypothesis Test for Harmful Covariate Shift [3.1406146587437904]
Machine learning systems in high-risk domains need to identify when predictions should not be made on out-of-distribution test examples.
In this work, we use the discordance between an ensemble of classifiers trained to agree on training data and disagree on test data to determine when a model should be removed from the deployment setting.
arXiv Detail & Related papers (2022-12-06T04:15:24Z)
- Uncertainty Modeling for Out-of-Distribution Generalization [56.957731893992495]
We argue that the feature statistics can be properly manipulated to improve the generalization ability of deep learning models.
Common methods often consider the feature statistics as deterministic values measured from the learned features.
We improve the network generalization ability by modeling the uncertainty of domain shifts with synthesized feature statistics during training.
arXiv Detail & Related papers (2022-02-08T16:09:12Z)
- Conformal prediction for the design problem [72.14982816083297]
In many real-world deployments of machine learning, we use a prediction algorithm to choose what data to test next.
In such settings, there is a distinct type of distribution shift between the training and test data.
We introduce a method to quantify predictive uncertainty in such settings.
arXiv Detail & Related papers (2022-02-08T02:59:12Z)
- Learn what you can't learn: Regularized Ensembles for Transductive Out-of-distribution Detection [76.39067237772286]
We show that current out-of-distribution (OOD) detection algorithms for neural networks produce unsatisfactory results in a variety of OOD detection scenarios.
This paper studies how such "hard" OOD scenarios can benefit from adjusting the detection method after observing a batch of the test data.
We propose a novel method that uses an artificial labeling scheme for the test data and regularization to obtain ensembles of models that produce contradictory predictions only on the OOD samples in a test batch.
arXiv Detail & Related papers (2020-12-10T16:55:13Z) - On the Transferability of Adversarial Attacksagainst Neural Text
Classifier [121.6758865857686]
We investigate the transferability of adversarial examples for text classification models.
We propose a genetic algorithm to find an ensemble of models that can induce adversarial examples to fool almost all existing models.
We derive word replacement rules that can be used for model diagnostics from these adversarial examples.
arXiv Detail & Related papers (2020-11-17T10:45:05Z) - Understanding Classifier Mistakes with Generative Models [88.20470690631372]
Deep neural networks are effective on supervised learning tasks, but have been shown to be brittle.
In this paper, we leverage generative models to identify and characterize instances where classifiers fail to generalize.
Our approach is agnostic to class labels from the training set which makes it applicable to models trained in a semi-supervised way.
arXiv Detail & Related papers (2020-10-05T22:13:21Z) - Tracking disease outbreaks from sparse data with Bayesian inference [55.82986443159948]
The COVID-19 pandemic provides new motivation for estimating the empirical rate of transmission during an outbreak.
Standard methods struggle to accommodate the partial observability and sparse data common at finer scales.
We propose a Bayesian framework which accommodates partial observability in a principled manner.
arXiv Detail & Related papers (2020-09-12T20:37:33Z)