Generating and Detecting True Ambiguity: A Forgotten Danger in DNN
Supervision Testing
- URL: http://arxiv.org/abs/2207.10495v2
- Date: Fri, 8 Sep 2023 05:57:49 GMT
- Title: Generating and Detecting True Ambiguity: A Forgotten Danger in DNN
Supervision Testing
- Authors: Michael Weiss, André García Gómez, Paolo Tonella
- Abstract summary: We propose a novel way to generate ambiguous inputs to test Deep Neural Network (DNN) supervisors.
In particular, we propose AmbiGuess to generate ambiguous samples for image classification problems.
We find that those best suited to detect true ambiguity perform worse on invalid, out-of-distribution and adversarial inputs and vice-versa.
- Score: 8.210473195536077
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep Neural Networks (DNNs) are becoming a crucial component of modern
software systems, but they are prone to fail under conditions that are
different from the ones observed during training (out-of-distribution inputs)
or on inputs that are truly ambiguous, i.e., inputs that admit multiple classes
with nonzero probability in their labels. Recent work proposed DNN supervisors
to detect high-uncertainty inputs before their possible misclassification leads
to any harm. To test and compare the capabilities of DNN supervisors,
researchers proposed test generation techniques, to focus the testing effort on
high-uncertainty inputs that should be recognized as anomalous by supervisors.
However, existing test generators aim to produce out-of-distribution inputs. No
existing model- and supervisor-independent technique targets the generation of
truly ambiguous test inputs, i.e., inputs that admit multiple classes according
to expert human judgment.
In this paper, we propose a novel way to generate ambiguous inputs to test
DNN supervisors and use it to empirically compare several existing supervisor
techniques. In particular, we propose AmbiGuess to generate ambiguous samples
for image classification problems. AmbiGuess is based on gradient-guided
sampling in the latent space of a regularized adversarial autoencoder.
Moreover, we conducted what is -- to the best of our knowledge -- the most
extensive comparative study of DNN supervisors, considering their capabilities
to detect 4 distinct types of high-uncertainty inputs, including truly
ambiguous ones. We find that the tested supervisors' capabilities are
complementary: Those best suited to detect true ambiguity perform worse on
invalid, out-of-distribution and adversarial inputs and vice-versa.
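A minimal sketch of the gradient-guided latent-space search described above, written in Python with TensorFlow. It is not the authors' implementation: the names `decoder` and `classifier` are assumptions standing in for the decoder of a pre-trained (regularized adversarial) autoencoder and for the image classifier whose supervisor is under test, and the loop simply nudges a latent code until the decoded image receives roughly equal probability for two chosen classes.

```python
import tensorflow as tf

def ambiguous_sample(decoder, classifier, class_a, class_b,
                     latent_dim=64, steps=200, lr=0.05, seed=0):
    """Gradient-guided search in latent space for an input that the classifier
    assigns roughly equal probability to `class_a` and `class_b`.

    `decoder` and `classifier` are assumed to be trained tf.keras models:
    the decoder of a (regularized adversarial) autoencoder and the image
    classifier whose supervisor is under test, respectively."""
    tf.random.set_seed(seed)
    z = tf.Variable(tf.random.normal([1, latent_dim]))       # latent code to optimize
    target = tf.constant([[0.5, 0.5]])                       # desired class probabilities
    optimizer = tf.keras.optimizers.Adam(learning_rate=lr)

    for _ in range(steps):
        with tf.GradientTape() as tape:
            image = decoder(z, training=False)                # decode latent code to an image
            probs = classifier(image, training=False)         # softmax over all classes
            picked = tf.gather(probs, [class_a, class_b], axis=1)
            loss = tf.reduce_sum(tf.square(picked - target))  # push both classes toward 0.5
        grads = tape.gradient(loss, [z])
        optimizer.apply_gradients(zip(grads, [z]))

    return decoder(z, training=False)                         # candidate truly-ambiguous input
```

AmbiGuess itself samples in the latent space of a specifically regularized adversarial autoencoder, so this sketch only conveys the general gradient-guided mechanism, not the paper's exact sampling procedure.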
Related papers
- Generative Edge Detection with Stable Diffusion [52.870631376660924]
Edge detection is typically viewed as a pixel-level classification problem mainly addressed by discriminative methods.
We propose a novel approach, named Generative Edge Detector (GED), by fully utilizing the potential of the pre-trained stable diffusion model.
We conduct extensive experiments on multiple datasets and achieve competitive performance.
arXiv Detail & Related papers (2024-10-04T01:52:23Z) - DeepSample: DNN sampling-based testing for operational accuracy assessment [12.029919627622954]
Deep Neural Networks (DNN) are core components for classification and regression tasks of many software systems.
The challenge is to select a representative set of test inputs as small as possible to reduce the labelling cost.
This study presents DeepSample, a family of DNN testing techniques for cost-effective accuracy assessment.
arXiv Detail & Related papers (2024-03-28T09:56:26Z) - Unsupervised Continual Anomaly Detection with Contrastively-learned
Prompt [80.43623986759691]
We introduce a novel Unsupervised Continual Anomaly Detection framework called UCAD.
The framework equips unsupervised anomaly detection (UAD) with continual learning capability through contrastively-learned prompts.
We conduct comprehensive experiments and set the benchmark on unsupervised continual anomaly detection and segmentation.
arXiv Detail & Related papers (2024-01-02T03:37:11Z) - Rethinking Diversity in Deep Neural Network Testing [25.641743200458382]
We propose a shift in perspective for testing deep neural networks (DNNs): we advocate for treating DNN testing as a directed testing problem rather than a diversity-based testing task.
Our evaluation demonstrates that diversity metrics are particularly weak indicators for identifying buggy inputs resulting from small input perturbations.
arXiv Detail & Related papers (2023-05-25T04:13:51Z) - The #DNN-Verification Problem: Counting Unsafe Inputs for Deep Neural
Networks [94.63547069706459]
The #DNN-Verification problem involves counting the number of input configurations of a DNN that result in a violation of a safety property.
We propose a novel approach that returns the exact count of violations.
We present experimental results on a set of safety-critical benchmarks.
arXiv Detail & Related papers (2023-01-17T18:32:01Z) - Uncertainty Quantification for Deep Neural Networks: An Empirical
Comparison and Usage Guidelines [4.987581730476023]
Deep Neural Networks (DNN) are increasingly used as components of larger software systems that need to process complex data.
A Deep Learning based System (DLS) can implement a supervisor by means of uncertainty estimation.
arXiv Detail & Related papers (2022-12-14T09:12:30Z) - W2N:Switching From Weak Supervision to Noisy Supervision for Object
Detection [64.10643170523414]
We propose a novel WSOD framework with a new paradigm that switches from weak supervision to noisy supervision (W2N).
In the localization adaptation module, we propose a regularization loss to reduce the proportion of discriminative parts in original pseudo ground-truths.
Our W2N outperforms all existing pure WSOD methods and transfer learning methods.
arXiv Detail & Related papers (2022-07-25T12:13:48Z) - Latent Boundary-guided Adversarial Training [61.43040235982727]
Adversarial training, which injects adversarial examples into model training, has proven to be the most effective defense against adversarial attacks.
We propose a novel adversarial training framework called LAtent bounDary-guided aDvErsarial tRaining.
arXiv Detail & Related papers (2022-06-08T07:40:55Z) - Distribution-Aware Testing of Neural Networks Using Generative Models [5.618419134365903]
The reliability of software that includes a Deep Neural Network (DNN) component is critically important.
We show that three recent testing techniques generate a significant number of invalid test inputs.
We propose a technique to incorporate the valid input space of the DNN model under test in the test generation process.
arXiv Detail & Related papers (2021-02-26T17:18:21Z) - Fail-Safe Execution of Deep Learning based Systems through Uncertainty
Monitoring [4.56877715768796]
A fail-safe Deep Learning based System (DLS) is equipped to handle DNN faults by means of a supervisor.
We propose an approach to use DNN uncertainty estimators to implement such a supervisor; a generic sketch of this supervision pattern appears after this list.
We describe our publicly available tool UNCERTAINTY-WIZARD, which allows transparent estimation of uncertainty for regular tf.keras DNNs.
arXiv Detail & Related papers (2021-02-01T15:22:54Z) - A Survey on Assessing the Generalization Envelope of Deep Neural
Networks: Predictive Uncertainty, Out-of-distribution and Adversarial Samples [77.99182201815763]
Deep Neural Networks (DNNs) achieve state-of-the-art performance on numerous applications.
It is difficult to tell beforehand whether a DNN receiving an input will deliver the correct output, since its decision criteria are usually nontransparent.
This survey connects the three fields within the larger framework of investigating the generalization performance of machine learning methods and in particular DNNs.
arXiv Detail & Related papers (2020-08-21T09:12:52Z)
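Several entries above, like the main paper, revolve around a DNN supervisor that flags high-uncertainty inputs at runtime so that a fail-safe path can take over. The sketch below illustrates that general pattern only: it is a generic, assumption-laden example (plain NumPy, softmax-entropy thresholding), not the API of UNCERTAINTY-WIZARD or of any other specific tool.

```python
import numpy as np

def softmax_entropy(probs, eps=1e-12):
    """Predictive entropy of softmax outputs with shape (n_samples, n_classes)."""
    return -np.sum(probs * np.log(probs + eps), axis=1)

def calibrate_threshold(val_probs, flag_rate=0.05):
    """Choose a threshold so that roughly `flag_rate` of nominal validation
    inputs would be deferred to the fail-safe handler."""
    return np.quantile(softmax_entropy(val_probs), 1.0 - flag_rate)

def supervise(probs, threshold):
    """Return predicted classes plus a boolean mask of inputs the supervisor
    rejects (True = do not act on the DNN output, defer to a fail-safe path)."""
    predictions = np.argmax(probs, axis=1)
    rejected = softmax_entropy(probs) > threshold
    return predictions, rejected
```

A supervisor calibrated this way will typically catch some failure types better than others, which is the complementarity the main paper reports across ambiguous, invalid, out-of-distribution and adversarial inputs.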
The list above is automatically generated from the titles and abstracts of the papers on this site.