Neuron Sensitivity Guided Test Case Selection for Deep Learning Testing
- URL: http://arxiv.org/abs/2307.11011v1
- Date: Thu, 20 Jul 2023 16:36:04 GMT
- Title: Neuron Sensitivity Guided Test Case Selection for Deep Learning Testing
- Authors: Dong Huang, Qingwen Bu, Yichao Fu, Yuhao Qing, Bocheng Xiao, Heming
Cui
- Abstract summary: Deep Neural Networks (DNNs) have been widely deployed in software to address various tasks.
DNN developers often collect rich unlabeled datasets from the natural world and label them to test the DNN models.
We propose NSS, Neuron Sensitivity guided test case Selection, which can reduce the labeling time by selecting valuable test cases from unlabeled datasets.
- Score: 6.686765165569934
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep Neural Networks (DNNs) have been widely deployed in software to address
various tasks (e.g., autonomous driving, medical diagnosis). However, they
can also produce incorrect behaviors that result in financial losses and even
threaten human safety. To reveal incorrect behaviors in DNNs and repair
them, DNN developers often collect rich unlabeled datasets from the natural
world and label them to test the DNN models. However, properly labeling such a
large amount of unlabeled data is a highly expensive and time-consuming task.
To address this problem, we propose NSS, Neuron Sensitivity
guided test case Selection, which reduces labeling time by selecting
valuable test cases from unlabeled datasets. NSS leverages the internal
neuron information induced by test cases to select those with high
confidence of causing the model to behave incorrectly. We evaluate
NSS on four widely used datasets and four well-designed DNN models against
SOTA baseline methods. The results show that NSS performs well in assessing
both the test cases' probability of fault triggering and their model improvement
capabilities. Specifically, compared with baseline approaches, NSS obtains a
higher fault detection rate (e.g., when selecting 5% of test cases from the
unlabeled dataset in the MNIST & LeNet1 experiment, NSS obtains an 81.8% fault
detection rate, 20% higher than the baselines).
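The abstract's core idea — score unlabeled inputs by how strongly they excite the model's internal neurons, then label only the top-ranked ones — can be sketched in a few lines. This is a minimal illustrative toy, not the paper's actual NSS algorithm: the one-layer "model", the finite-difference sensitivity proxy, and the `budget` parameter are all hypothetical stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a trained model: one dense layer with ReLU activations.
W = rng.normal(size=(8, 4))

def activations(x):
    """Internal neuron activations for input x."""
    return np.maximum(W @ x, 0.0)

def sensitivity_score(x, eps=1e-2):
    """Hypothetical sensitivity proxy: how much the neuron activations
    shift under a small input perturbation. The assumption (in the spirit
    of the abstract) is that high-sensitivity inputs are more likely to
    trigger faulty model behavior."""
    base = activations(x)
    shifted = activations(x + eps)  # deterministic finite-difference probe
    return float(np.mean(np.abs(shifted - base)) / eps)

# Unlabeled pool: 20 random inputs. We only "pay" to label `budget` of them.
unlabeled = [rng.normal(size=4) for _ in range(20)]
scores = [sensitivity_score(x) for x in unlabeled]

budget = 3
order = np.argsort(scores)[::-1]   # indices, highest sensitivity first
top = order[:budget]               # the test cases we would send for labeling
selected = [unlabeled[i] for i in top]
print(len(selected))  # 3
```

In this sketch the labeling budget (5% in the paper's MNIST & LeNet1 example) becomes `budget`, and the ranking step is the whole method: everything below the cutoff stays unlabeled.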
Related papers
- Feature Map Testing for Deep Neural Networks [6.931570234442819]
We propose DeepFeature, which tests DNNs from the feature map level.
DeepFeature has a high fault detection rate and can detect more types of faults (compared to coverage-guided selection techniques, DeepFeature increases the fault detection rate by 49.32%).
arXiv Detail & Related papers (2023-07-21T13:15:15Z) - Bridging Precision and Confidence: A Train-Time Loss for Calibrating
Object Detection [58.789823426981044]
We propose a novel auxiliary loss formulation that aims to align the class confidence of bounding boxes with the accuracy of predictions.
Our results reveal that our train-time loss surpasses strong calibration baselines in reducing calibration error for both in and out-domain scenarios.
arXiv Detail & Related papers (2023-03-25T08:56:21Z) - DeepGD: A Multi-Objective Black-Box Test Selection Approach for Deep
Neural Networks [0.6249768559720121]
DeepGD is a black-box multi-objective test selection approach for Deep neural networks (DNNs)
It reduces the cost of labeling by prioritizing the selection of test inputs with high fault revealing power from large unlabeled datasets.
arXiv Detail & Related papers (2023-03-08T20:33:09Z) - The #DNN-Verification Problem: Counting Unsafe Inputs for Deep Neural
Networks [94.63547069706459]
#DNN-Verification problem involves counting the number of input configurations of a DNN that result in a violation of a safety property.
We propose a novel approach that returns the exact count of violations.
We present experimental results on a set of safety-critical benchmarks.
arXiv Detail & Related papers (2023-01-17T18:32:01Z) - Efficient Testing of Deep Neural Networks via Decision Boundary Analysis [28.868479656437145]
We propose a novel technique, named Aries, that can estimate the performance of DNNs on new unlabeled data.
The estimated accuracy by Aries is only 0.03% -- 2.60% (on average 0.61%) off the true accuracy.
arXiv Detail & Related papers (2022-07-22T08:39:10Z) - NP-Match: When Neural Processes meet Semi-Supervised Learning [133.009621275051]
Semi-supervised learning (SSL) has been widely explored in recent years, and it is an effective way of leveraging unlabeled data to reduce the reliance on labeled data.
In this work, we adjust neural processes (NPs) to the semi-supervised image classification task, resulting in a new method named NP-Match.
arXiv Detail & Related papers (2022-07-03T15:24:31Z) - Labeling-Free Comparison Testing of Deep Learning Models [28.47632100019289]
We propose a labeling-free comparison testing approach to overcome the limitations of labeling effort and sampling randomness.
Our approach outperforms the baseline methods by up to 0.74 and 0.53 on Spearman's correlation and Kendall's $\tau$, regardless of the dataset and distribution shift.
arXiv Detail & Related papers (2022-04-08T10:55:45Z) - Shift-Robust GNNs: Overcoming the Limitations of Localized Graph
Training data [52.771780951404565]
Shift-Robust GNN (SR-GNN) is designed to account for distributional differences between biased training data and the graph's true inference distribution.
We show that SR-GNN outperforms other GNN baselines in accuracy, eliminating at least 40% of the negative effects introduced by biased training data.
arXiv Detail & Related papers (2021-08-02T18:00:38Z) - Computing the Testing Error without a Testing Set [33.068870286618655]
We derive an algorithm to estimate the performance gap between training and testing that does not require any testing dataset.
This allows us to compute the DNN's testing error on unseen samples, even when we do not have access to them.
arXiv Detail & Related papers (2020-05-01T15:35:50Z) - GraN: An Efficient Gradient-Norm Based Detector for Adversarial and
Misclassified Examples [77.99182201815763]
Deep neural networks (DNNs) are vulnerable to adversarial examples and other data perturbations.
GraN is a time- and parameter-efficient method that is easily adaptable to any DNN.
GraN achieves state-of-the-art performance on numerous problem set-ups.
arXiv Detail & Related papers (2020-04-20T10:09:27Z) - Bayesian x-vector: Bayesian Neural Network based x-vector System for
Speaker Verification [71.45033077934723]
We incorporate Bayesian neural networks (BNNs) into the deep neural network (DNN) x-vector speaker verification system.
With the weight uncertainty modeling provided by BNNs, we expect the system could generalize better on the evaluation data.
Results show that the system could benefit from BNNs by a relative EER decrease of 2.66% and 2.32% respectively for short- and long-utterance in-domain evaluations.
arXiv Detail & Related papers (2020-04-08T14:35:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.