Neuron Sensitivity Guided Test Case Selection for Deep Learning Testing
- URL: http://arxiv.org/abs/2307.11011v1
- Date: Thu, 20 Jul 2023 16:36:04 GMT
- Title: Neuron Sensitivity Guided Test Case Selection for Deep Learning Testing
- Authors: Dong Huang, Qingwen Bu, Yichao Fu, Yuhao Qing, Bocheng Xiao, Heming
Cui
- Abstract summary: Deep Neural Networks (DNNs) have been widely deployed in software to address various tasks.
DNN developers often collect rich unlabeled datasets from the natural world and label them to test the DNN models.
We propose NSS, Neuron Sensitivity guided test case Selection, which can reduce the labeling time by selecting valuable test cases from unlabeled datasets.
- Score: 6.686765165569934
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep Neural Networks (DNNs) have been widely deployed in software to address
various tasks (e.g., autonomous driving, medical diagnosis). However, they
can also produce incorrect behaviors that result in financial losses and even
threaten human safety. To reveal incorrect behaviors in DNNs and repair
them, DNN developers often collect rich unlabeled datasets from the natural
world and label them to test the DNN models. However, properly labeling such a
large amount of unlabeled data is a highly expensive and time-consuming task.
To address this problem, we propose NSS, Neuron Sensitivity
guided test case Selection, which reduces labeling time by selecting
valuable test cases from unlabeled datasets. NSS leverages the internal
neuron information induced by test cases to select those with high
confidence of causing the model to behave incorrectly. We evaluate
NSS on four widely used datasets and four well-designed DNN models against
SOTA baseline methods. The results show that NSS performs well in assessing
both the test cases' probability of fault triggering and their model improvement
capabilities. Specifically, compared with baseline approaches, NSS obtains a
higher fault detection rate (e.g., when selecting 5% of test cases from the
unlabeled dataset in the MNIST & LeNet1 experiment, NSS obtains an 81.8% fault
detection rate, 20% higher than the baselines).
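The abstract's core idea — score unlabeled inputs by how strongly they excite the model's internal neurons, then label only the top-ranked ones — can be sketched in a few lines. This is a minimal illustrative toy, not the paper's actual NSS algorithm: the one-layer "model", the finite-difference sensitivity proxy, and the `budget` parameter are all hypothetical stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a trained model: one dense layer with ReLU activations.
W = rng.normal(size=(8, 4))

def activations(x):
    """Internal neuron activations for input x."""
    return np.maximum(W @ x, 0.0)

def sensitivity_score(x, eps=1e-2):
    """Hypothetical sensitivity proxy: how much the neuron activations
    shift under a small input perturbation. The assumption (in the spirit
    of the abstract) is that high-sensitivity inputs are more likely to
    trigger faulty model behavior."""
    base = activations(x)
    shifted = activations(x + eps)  # deterministic finite-difference probe
    return float(np.mean(np.abs(shifted - base)) / eps)

# Unlabeled pool: 20 random inputs. We only "pay" to label `budget` of them.
unlabeled = [rng.normal(size=4) for _ in range(20)]
scores = [sensitivity_score(x) for x in unlabeled]

budget = 3
order = np.argsort(scores)[::-1]   # indices, highest sensitivity first
top = order[:budget]               # the test cases we would send for labeling
selected = [unlabeled[i] for i in top]
print(len(selected))  # 3
```

In this sketch the labeling budget (5% in the paper's MNIST & LeNet1 example) becomes `budget`, and the ranking step is the whole method: everything below the cutoff stays unlabeled.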
Related papers
- Feature Map Testing for Deep Neural Networks [6.931570234442819]
We propose DeepFeature, which tests DNNs from the feature map level.
DeepFeature has a high fault detection rate and can detect more types of faults (compared to coverage-guided selection techniques, DeepFeature increases the fault detection rate by 49.32%).
arXiv Detail & Related papers (2023-07-21T13:15:15Z) - Bridging Precision and Confidence: A Train-Time Loss for Calibrating
Object Detection [58.789823426981044]
We propose a novel auxiliary loss formulation that aims to align the class confidence of bounding boxes with the accuracy of predictions.
Our results reveal that our train-time loss surpasses strong calibration baselines in reducing calibration error for both in and out-domain scenarios.
arXiv Detail & Related papers (2023-03-25T08:56:21Z) - DeepGD: A Multi-Objective Black-Box Test Selection Approach for Deep
Neural Networks [0.6249768559720121]
DeepGD is a black-box multi-objective test selection approach for Deep neural networks (DNNs)
It reduces the cost of labeling by prioritizing the selection of test inputs with high fault revealing power from large unlabeled datasets.
arXiv Detail & Related papers (2023-03-08T20:33:09Z) - The #DNN-Verification Problem: Counting Unsafe Inputs for Deep Neural
Networks [94.63547069706459]
#DNN-Verification problem involves counting the number of input configurations of a DNN that result in a violation of a safety property.
We propose a novel approach that returns the exact count of violations.
We present experimental results on a set of safety-critical benchmarks.
arXiv Detail & Related papers (2023-01-17T18:32:01Z) - Efficient Testing of Deep Neural Networks via Decision Boundary Analysis [28.868479656437145]
We propose a novel technique, named Aries, that can estimate the performance of DNNs on new unlabeled data.
The estimated accuracy by Aries is only 0.03% -- 2.60% (on average 0.61%) off the true accuracy.
arXiv Detail & Related papers (2022-07-22T08:39:10Z) - NP-Match: When Neural Processes meet Semi-Supervised Learning [133.009621275051]
Semi-supervised learning (SSL) has been widely explored in recent years, and it is an effective way of leveraging unlabeled data to reduce the reliance on labeled data.
In this work, we adjust neural processes (NPs) to the semi-supervised image classification task, resulting in a new method named NP-Match.
arXiv Detail & Related papers (2022-07-03T15:24:31Z) - Labeling-Free Comparison Testing of Deep Learning Models [28.47632100019289]
We propose a labeling-free comparison testing approach to overcome the limitations of labeling effort and sampling randomness.
Our approach outperforms the baseline methods by up to 0.74 and 0.53 on Spearman's correlation and Kendall's $\tau$, regardless of the dataset and distribution shift.
arXiv Detail & Related papers (2022-04-08T10:55:45Z) - Shift-Robust GNNs: Overcoming the Limitations of Localized Graph
Training data [52.771780951404565]
Shift-Robust GNN (SR-GNN) is designed to account for distributional differences between biased training data and the graph's true inference distribution.
We show that SR-GNN outperforms other GNN baselines in accuracy, eliminating at least 40% of the negative effects introduced by biased training data.
arXiv Detail & Related papers (2021-08-02T18:00:38Z) - Computing the Testing Error without a Testing Set [33.068870286618655]
We derive an algorithm to estimate the performance gap between training and testing that does not require any testing dataset.
This allows us to compute the DNN's testing error on unseen samples, even when we do not have access to them.
arXiv Detail & Related papers (2020-05-01T15:35:50Z) - GraN: An Efficient Gradient-Norm Based Detector for Adversarial and
Misclassified Examples [77.99182201815763]
Deep neural networks (DNNs) are vulnerable to adversarial examples and other data perturbations.
GraN is a time- and parameter-efficient method that is easily adaptable to any DNN.
GraN achieves state-of-the-art performance on numerous problem set-ups.
arXiv Detail & Related papers (2020-04-20T10:09:27Z) - Bayesian x-vector: Bayesian Neural Network based x-vector System for
Speaker Verification [71.45033077934723]
We incorporate Bayesian neural networks (BNNs) into the deep neural network (DNN) x-vector speaker verification system.
With the weight uncertainty modeling provided by BNNs, we expect the system could generalize better on the evaluation data.
Results show that the system could benefit from BNNs by a relative EER decrease of 2.66% and 2.32% respectively for short- and long-utterance in-domain evaluations.
arXiv Detail & Related papers (2020-04-08T14:35:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.