Related papers: DeepSample: DNN sampling-based testing for operational accuracy assessment

URL: http://arxiv.org/abs/2403.19271v1
Date: Thu, 28 Mar 2024 09:56:26 GMT
Title: DeepSample: DNN sampling-based testing for operational accuracy assessment
Authors: Antonio Guerriero, Roberto Pietrantuono, Stefano Russo
Abstract summary: Deep Neural Networks (DNN) are core components for classification and regression tasks of many software systems. The challenge is to select a representative set of test inputs as small as possible to reduce the labelling cost. This study presents DeepSample, a family of DNN testing techniques for cost-effective accuracy assessment.
Score: 12.029919627622954
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Deep Neural Networks (DNN) are core components for classification and regression tasks of many software systems. Companies incur high costs for testing DNN with datasets representative of the inputs expected in operation, as these need to be manually labelled. The challenge is to select a representative set of test inputs as small as possible to reduce the labelling cost, while sufficing to yield unbiased high-confidence estimates of the expected DNN accuracy. At the same time, testers are interested in exposing as many DNN mispredictions as possible to improve the DNN, which creates a need for techniques pursuing a threefold aim: small dataset size, trustworthy estimates, and misprediction exposure. This study presents DeepSample, a family of DNN testing techniques for cost-effective accuracy assessment based on probabilistic sampling. We investigate whether, to what extent, and under which conditions probabilistic sampling can help to tackle the outlined challenge. We implement five new sampling-based testing techniques, and perform a comprehensive comparison of such techniques and of three further state-of-the-art techniques for both DNN classification and regression tasks. Results serve as guidance for best use of sampling-based testing for faithful and high-confidence estimates of DNN accuracy in operation at low cost.
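
Illustration (not taken from the paper): the abstract does not spell out the five DeepSample techniques, so the sketch below only shows the general idea of probabilistic sampling for accuracy estimation. It assumes a hypothetical per-input confidence score as auxiliary information and compares plain simple random sampling with a with-replacement scheme that favours low-confidence inputs and reweights each draw with the textbook Hansen-Hurwitz estimator. All function names and the simulated data are assumptions made for this example.

```python
import random


def srs_accuracy(correct, n, rng):
    """Simple random sampling without replacement: the sample mean is unbiased."""
    sample = rng.sample(range(len(correct)), n)
    return sum(correct[i] for i in sample) / n


def weighted_accuracy(correct, confidence, n, rng):
    """Draw with replacement, favouring low-confidence inputs, and reweight each
    draw by 1 / (N * p_i) (Hansen-Hurwitz) so the estimate stays unbiased.
    Returns the accuracy estimate and the number of mispredictions drawn."""
    N = len(correct)
    # Hypothetical auxiliary weight; the 0.05 floor keeps every p_i > 0.
    weights = [1.0 - c + 0.05 for c in confidence]
    total = sum(weights)
    probs = [w / total for w in weights]
    draws = rng.choices(range(N), weights=probs, k=n)
    estimate = sum(correct[i] / (N * probs[i]) for i in draws) / n
    failures = sum(1 - correct[i] for i in draws)
    return estimate, failures


if __name__ == "__main__":
    rng = random.Random(42)
    # Simulated operational dataset: in practice only the sampled inputs
    # would be labelled; here every label is known so the estimates can be checked.
    confidence = [rng.uniform(0.5, 1.0) for _ in range(10_000)]
    correct = [1 if rng.random() < c else 0 for c in confidence]

    true_acc = sum(correct) / len(correct)
    srs = srs_accuracy(correct, 200, rng)
    weighted, failures = weighted_accuracy(correct, confidence, 200, rng)
    print(f"true accuracy      : {true_acc:.3f}")
    print(f"SRS estimate       : {srs:.3f}")
    print(f"weighted estimate  : {weighted:.3f} ({failures} mispredictions drawn)")
```

Favouring low-confidence inputs tends to surface more mispredictions within the same labelling budget, while the reweighting keeps the accuracy estimate unbiased. Whether the variance is better or worse than simple random sampling depends on how well the auxiliary score tracks actual failures.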

Related papers

Make Me a BNN: A Simple Strategy for Estimating Bayesian Uncertainty from Pre-trained Models [40.38541033389344]
Deep Neural Networks (DNNs) are powerful tools for various computer vision tasks, yet they often struggle with reliable uncertainty quantification. We introduce the Adaptable Bayesian Neural Network (ABNN), a simple and scalable strategy to seamlessly transform DNNs into BNNs. We conduct extensive experiments across multiple datasets for image classification and semantic segmentation tasks, and our results demonstrate that ABNN achieves state-of-the-art performance.
arXiv Detail & Related papers (2023-12-23T16:39:24Z)
TEASMA: A Practical Methodology for Test Adequacy Assessment of Deep Neural Networks [4.528286105252983]
TEASMA is a comprehensive and practical methodology designed to accurately assess the adequacy of test sets for Deep Neural Networks. We evaluate TEASMA with four state-of-the-art test adequacy metrics: Distance-based Surprise Coverage (DSC), Likelihood-based Surprise Coverage (LSC), Input Distribution Coverage (IDC), and Mutation Score (MS).
arXiv Detail & Related papers (2023-08-02T17:56:05Z)
DeepGD: A Multi-Objective Black-Box Test Selection Approach for Deep Neural Networks [0.6249768559720121]
DeepGD is a black-box multi-objective test selection approach for deep neural networks (DNNs). It reduces the cost of labeling by prioritizing the selection of test inputs with high fault-revealing power from large unlabeled datasets.
arXiv Detail & Related papers (2023-03-08T20:33:09Z)
The #DNN-Verification Problem: Counting Unsafe Inputs for Deep Neural Networks [94.63547069706459]
The #DNN-Verification problem involves counting the number of input configurations of a DNN that result in a violation of a safety property. We propose a novel approach that returns the exact count of violations. We present experimental results on a set of safety-critical benchmarks.
arXiv Detail & Related papers (2023-01-17T18:32:01Z)
ALT-MAS: A Data-Efficient Framework for Active Testing of Machine Learning Algorithms [58.684954492439424]
We propose a novel framework to efficiently test a machine learning model using only a small amount of labeled test data. The idea is to estimate the metrics of interest for a model-under-test using a Bayesian neural network (BNN).
arXiv Detail & Related papers (2021-04-11T12:14:04Z)
A Biased Graph Neural Network Sampler with Near-Optimal Regret [57.70126763759996]
Graph neural networks (GNN) have emerged as a vehicle for applying deep network architectures to graph and relational data. In this paper, we build upon existing work and treat GNN neighbor sampling as a multi-armed bandit problem. We introduce a newly-designed reward function that introduces some degree of bias designed to reduce variance and avoid unstable, possibly-unbounded payouts.
arXiv Detail & Related papers (2021-03-01T15:55:58Z)
S2-BNN: Bridging the Gap Between Self-Supervised Real and 1-bit Neural Networks via Guided Distribution Calibration [74.5509794733707]
We present a novel guided learning paradigm that distills binary networks from real-valued networks on the final prediction distribution. Our proposed method can boost the simple contrastive learning baseline by an absolute gain of 5.5~15% on BNNs. Our method achieves substantial improvement over the simple contrastive learning baseline, and is even comparable to many mainstream supervised BNN methods.
arXiv Detail & Related papers (2021-02-17T18:59:28Z)
A Survey on Assessing the Generalization Envelope of Deep Neural Networks: Predictive Uncertainty, Out-of-distribution and Adversarial Samples [77.99182201815763]
Deep Neural Networks (DNNs) achieve state-of-the-art performance on numerous applications. It is difficult to tell beforehand if a DNN receiving an input will deliver the correct output since their decision criteria are usually nontransparent. This survey connects the three fields within the larger framework of investigating the generalization performance of machine learning methods and in particular DNNs.
arXiv Detail & Related papers (2020-08-21T09:12:52Z)
Increasing Trustworthiness of Deep Neural Networks via Accuracy Monitoring [20.456742449675904]
Inference accuracy of deep neural networks (DNNs) is a crucial performance metric, but can vary greatly in practice subject to actual test datasets. This has raised significant concerns with trustworthiness of DNNs, especially in safety-critical applications. We propose a neural network-based accuracy monitor model, which only takes the deployed DNN's softmax probability output as its input.
arXiv Detail & Related papers (2020-07-03T03:09:36Z)
Computing the Testing Error without a Testing Set [33.068870286618655]
We derive an algorithm to estimate the performance gap between training and testing that does not require any testing dataset. This allows us to compute the DNN's testing error on unseen samples, even when we do not have access to them.
arXiv Detail & Related papers (2020-05-01T15:35:50Z)
GraN: An Efficient Gradient-Norm Based Detector for Adversarial and Misclassified Examples [77.99182201815763]
Deep neural networks (DNNs) are vulnerable to adversarial examples and other data perturbations. GraN is a time- and parameter-efficient method that is easily adaptable to any DNN. GraN achieves state-of-the-art performance on numerous problem set-ups.
arXiv Detail & Related papers (2020-04-20T10:09:27Z)

This list is automatically generated from the titles and abstracts of the papers on this site.