Evaluating Deep Neural Networks in Deployment (A Comparative and Replicability Study)
- URL: http://arxiv.org/abs/2407.08730v2
- Date: Sat, 27 Jul 2024 18:27:05 GMT
- Title: Evaluating Deep Neural Networks in Deployment (A Comparative and Replicability Study)
- Authors: Eduard Pinconschi, Divya Gopinath, Rui Abreu, Corina S. Pasareanu
- Abstract summary: Deep neural networks (DNNs) are increasingly used in safety-critical applications.
We study recent approaches that have been proposed to evaluate the reliability of DNNs in deployment.
We find that it is hard to run and reproduce the results for these approaches on their replication packages and even more difficult to run them on artifacts other than their own.
- Score: 11.242083685224554
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As deep neural networks (DNNs) are increasingly used in safety-critical applications, there is a growing concern about their reliability. Even highly trained, high-performing networks are not 100% accurate. However, it is very difficult to predict their behavior during deployment without ground truth. In this paper, we provide a comparative and replicability study on recent approaches that have been proposed to evaluate the reliability of DNNs in deployment. We find that it is hard to run and reproduce the results for these approaches on their replication packages and even more difficult to run them on artifacts other than their own. Further, it is difficult to compare the effectiveness of the approaches, due to the lack of clearly defined evaluation metrics. Our results indicate that more effort is needed in our research community to obtain sound techniques for evaluating the reliability of neural networks in safety-critical domains. To this end, we contribute an evaluation framework that incorporates the considered approaches and enables evaluation on common benchmarks, using common metrics.
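The abstract does not spell out the framework's interface; the sketch below is only a guess at what evaluating several reliability estimators on a common benchmark with common metrics could look like. Every name in it (the estimator dictionary, the acceptance threshold, the metric names) is an illustrative assumption, not the paper's API.

```python
# Minimal sketch of a common-benchmark, common-metric evaluation loop for
# DNN reliability estimators. The estimator interface, threshold, and metric
# choices are illustrative assumptions, not the framework from the paper.
import numpy as np

def evaluate_reliability_estimator(trust_scores, predictions, labels, threshold=0.5):
    """Compare an estimator's accept/reject decisions against ground truth.

    trust_scores : per-input confidence in [0, 1] produced by the estimator
    predictions  : the DNN's predicted classes
    labels       : ground-truth classes (available only at evaluation time)
    """
    correct = (predictions == labels)        # did the DNN actually get it right?
    accepted = (trust_scores >= threshold)   # did the estimator trust the output?
    return {
        # How often accepting/rejecting matches actual correctness.
        "decision_accuracy": float(np.mean(accepted == correct)),
        # Among accepted inputs, how many were actually correct.
        "precision_on_accepted": float(np.mean(correct[accepted])) if accepted.any() else float("nan"),
        # Fraction of wrong predictions the estimator let through.
        "missed_error_rate": float(np.mean(accepted[~correct])) if (~correct).any() else 0.0,
    }

# Usage: run every estimator on the same benchmark and report the same metrics.
rng = np.random.default_rng(0)
labels = rng.integers(0, 10, size=1000)
predictions = np.where(rng.random(1000) < 0.9, labels, (labels + 1) % 10)  # ~90% accurate DNN
estimators = {
    "random_baseline": rng.random(1000),
    "oracle": (predictions == labels).astype(float),
}
for name, scores in estimators.items():
    print(name, evaluate_reliability_estimator(scores, predictions, labels))
```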
Related papers
- Towards Evaluating Transfer-based Attacks Systematically, Practically, and Fairly [79.07074710460012]
The adversarial vulnerability of deep neural networks (DNNs) has drawn great attention.
An increasing number of transfer-based methods have been developed to fool black-box DNN models.
We establish a transfer-based attack benchmark (TA-Bench) which implements 30+ methods.
arXiv Detail & Related papers (2023-11-02T15:35:58Z)
- APPRAISER: DNN Fault Resilience Analysis Employing Approximation Errors [1.1091582432763736]
Deep Neural Networks (DNNs) in safety-critical applications raise new reliability concerns.
State-of-the-art methods for fault injection by emulation incur a spectrum of time-, design- and control-complexity problems.
We propose APPRAISER, which applies functional approximation for a non-conventional purpose and employs approximate computing errors for fault resilience analysis.
arXiv Detail & Related papers (2023-05-31T10:53:46Z)
- Quantization-aware Interval Bound Propagation for Training Certifiably Robust Quantized Neural Networks [58.195261590442406]
We study the problem of training and certifying adversarially robust quantized neural networks (QNNs).
Recent work has shown that floating-point neural networks that have been verified to be robust can become vulnerable to adversarial attacks after quantization.
We present quantization-aware interval bound propagation (QA-IBP), a novel method for training robust QNNs.
arXiv Detail & Related papers (2022-11-29T13:32:38Z)
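For context, plain interval bound propagation, the building block that the quantization-aware variant above extends, can be sketched in a few lines. This shows only the generic technique (the layer sizes and perturbation radius are made up), not the QA-IBP method itself.

```python
# Plain interval bound propagation through one affine + ReLU layer.
# This illustrates the generic IBP idea only; QA-IBP adds quantization-aware
# handling on top of it, which is not reproduced here.
import numpy as np

def ibp_affine(lower, upper, W, b):
    """Propagate an elementwise box [lower, upper] through x -> W @ x + b."""
    center = (upper + lower) / 2.0
    radius = (upper - lower) / 2.0
    new_center = W @ center + b
    new_radius = np.abs(W) @ radius          # worst case over the box
    return new_center - new_radius, new_center + new_radius

def ibp_relu(lower, upper):
    """ReLU is monotone, so interval bounds map through directly."""
    return np.maximum(lower, 0.0), np.maximum(upper, 0.0)

# Example: bound the outputs of a tiny layer under an L_inf ball of radius eps.
rng = np.random.default_rng(0)
W, b = rng.standard_normal((3, 4)), rng.standard_normal(3)
x, eps = rng.standard_normal(4), 0.1
lo, hi = ibp_affine(x - eps, x + eps, W, b)
lo, hi = ibp_relu(lo, hi)
print("certified output box:", lo, hi)
```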
- Generalizability of Adversarial Robustness Under Distribution Shifts [57.767152566761304]
We take a first step towards investigating the interplay between empirical and certified adversarial robustness on the one hand and domain generalization on the other.
We train robust models on multiple domains and evaluate their accuracy and robustness on an unseen domain.
We extend our study to cover a real-world medical application, in which adversarial augmentation significantly boosts the generalization of robustness with minimal effect on clean data accuracy.
arXiv Detail & Related papers (2022-09-29T18:25:48Z)
- On the Relationship Between Adversarial Robustness and Decision Region in Deep Neural Network [26.656444835709905]
We study the internal properties of Deep Neural Networks (DNNs) that affect model robustness under adversarial attacks.
We propose the novel concept of the Populated Region Set (PRS), where training samples are populated more frequently.
arXiv Detail & Related papers (2022-07-07T16:06:34Z)
- Comparative Analysis of Interval Reachability for Robust Implicit and Feedforward Neural Networks [64.23331120621118]
We use interval reachability analysis to obtain robustness guarantees for implicit neural networks (INNs).
INNs are a class of implicit learning models that use implicit equations as layers.
We show that our approach performs at least as well as, and generally better than, applying state-of-the-art interval bound propagation methods to INNs.
arXiv Detail & Related papers (2022-04-01T03:31:27Z)
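To make "implicit equations as layers" concrete, a minimal fixed-point sketch of an implicit layer is shown below. It assumes a contraction (a deliberately small-norm weight matrix) and is not the interval-reachability analysis described in the paper.

```python
# A minimal implicit layer: the output z is defined by the fixed-point equation
#   z = tanh(W @ z + U @ x + b)
# and computed by iteration rather than a single forward pass. This only
# illustrates what "implicit equations as layers" means; the contraction
# assumption (small spectral norm of W) is an assumption made for the sketch.
import numpy as np

def implicit_layer(x, W, U, b, tol=1e-8, max_iter=500):
    z = np.zeros(W.shape[0])
    for _ in range(max_iter):
        z_next = np.tanh(W @ z + U @ x + b)
        if np.linalg.norm(z_next - z) < tol:   # fixed point reached
            return z_next
        z = z_next
    return z

rng = np.random.default_rng(0)
n, m = 5, 3
W = 0.3 * rng.standard_normal((n, n))   # scaled down so the map is a contraction
U = rng.standard_normal((n, m))
b = rng.standard_normal(n)
x = rng.standard_normal(m)
print("equilibrium output:", implicit_layer(x, W, U, b))
```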
- On the Practicality of Deterministic Epistemic Uncertainty [106.06571981780591]
Deterministic uncertainty methods (DUMs) achieve strong performance on detecting out-of-distribution data.
It remains unclear whether DUMs are well calibrated and can seamlessly scale to real-world applications.
arXiv Detail & Related papers (2021-07-01T17:59:07Z)
- Residual Error: a New Performance Measure for Adversarial Robustness [85.0371352689919]
A major challenge that limits the widespread adoption of deep learning has been its fragility to adversarial attacks.
This study presents the concept of residual error, a new performance measure for assessing the adversarial robustness of a deep neural network.
Experimental results using the case of image classification demonstrate the effectiveness and efficacy of the proposed residual error metric.
arXiv Detail & Related papers (2021-06-18T16:34:23Z)
- Data-Driven Assessment of Deep Neural Networks with Random Input Uncertainty [14.191310794366075]
We develop a data-driven optimization-based method capable of simultaneously certifying the safety of network outputs and localizing them.
We experimentally demonstrate the efficacy and tractability of the method on a deep ReLU network.
arXiv Detail & Related papers (2020-10-02T19:13:35Z)
- Confidence-Aware Learning for Deep Neural Networks [4.9812879456945]
We propose a method of training deep neural networks with a novel loss function, named Correctness Ranking Loss.
It regularizes class probabilities explicitly to be better confidence estimates in terms of ordinal ranking according to confidence.
It has almost the same computational costs for training as conventional deep classifiers and outputs reliable predictions by a single inference.
arXiv Detail & Related papers (2020-07-03T02:00:35Z)
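The ordinal-ranking idea above can be illustrated with a generic pairwise confidence-ranking penalty; the margin value and the pairing scheme are assumptions for this sketch, not the paper's Correctness Ranking Loss.

```python
# Sketch of a pairwise confidence-ranking penalty in the spirit of (but not
# identical to) the Correctness Ranking Loss: confidences of correctly
# classified samples should rank above those of misclassified ones.
import numpy as np

def confidence_ranking_penalty(confidences, correct, margin=0.1):
    """Average hinge penalty over (correct, incorrect) sample pairs.

    confidences : max softmax probability per sample, shape (N,)
    correct     : boolean array, True where the classifier was right
    """
    pos = confidences[correct]       # confidences that should rank high
    neg = confidences[~correct]      # confidences that should rank low
    if pos.size == 0 or neg.size == 0:
        return 0.0
    # Penalise every pair where an incorrect sample is not at least `margin`
    # less confident than a correct one.
    gaps = pos[:, None] - neg[None, :]
    return float(np.mean(np.maximum(0.0, margin - gaps)))

# Toy usage: well-ranked confidences incur a smaller penalty than inverted ones.
conf = np.array([0.95, 0.90, 0.40, 0.35])
correct = np.array([True, True, False, False])
print(confidence_ranking_penalty(conf, correct))    # 0.0 (correctly ranked)
print(confidence_ranking_penalty(conf, ~correct))   # large (ranking inverted)
```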
- Debona: Decoupled Boundary Network Analysis for Tighter Bounds and Faster Adversarial Robustness Proofs [2.1320960069210484]
Neural networks are commonly used in safety-critical real-world applications.
Proving that no adversarial examples exist, or providing a concrete instance when they do, is therefore crucial to ensure safe applications.
We provide proofs for tight upper and lower bounds on max-pooling layers in convolutional networks.
arXiv Detail & Related papers (2020-06-16T10:00:33Z)