Efficient Testing of Deep Neural Networks via Decision Boundary Analysis
- URL: http://arxiv.org/abs/2207.10942v1
- Date: Fri, 22 Jul 2022 08:39:10 GMT
- Title: Efficient Testing of Deep Neural Networks via Decision Boundary Analysis
- Authors: Qiang Hu, Yuejun Guo, Xiaofei Xie, Maxime Cordy, Lei Ma, Mike
Papadakis, Yves Le Traon
- Abstract summary: We propose a novel technique, named Aries, that can estimate the performance of DNNs on new unlabeled data.
The accuracy estimated by Aries is only 0.03%--2.60% (0.61% on average) off the true accuracy.
- Score: 28.868479656437145
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning plays an increasingly important role in our daily
life due to its competitive performance in multiple industrial application
domains. As the core of DL-enabled systems, deep neural networks (DNNs)
automatically learn knowledge from carefully collected and organized training
data to gain the ability to predict the labels of unseen data. Like
traditional software systems, which need to be comprehensively tested, DNNs
also need to be carefully evaluated to ensure that the quality of the trained
model meets the demand. In practice, the de facto standard for assessing the
quality of DNNs in industry is to check their performance (accuracy) on a
collected set of labeled test data. However, preparing such labeled data is
often not easy, largely because data labeling is labor-intensive, especially
with massive new unlabeled data arriving every day. Recent studies show that
test selection for DNNs is a promising direction that tackles this issue by
selecting a minimal set of representative data to label and using these data
to assess the model. However, it still requires human effort and cannot be
fully automated. In this paper, we propose a novel technique, named Aries,
that can estimate the performance of DNNs on new unlabeled data using only
information obtained from the original test data. The key insight behind our
technique is that the model should have similar prediction accuracy on data
that lie at similar distances to the decision boundary. We performed a
large-scale evaluation of our technique on 13 types of data transformation
methods. The results demonstrate its usefulness: the accuracy estimated by
Aries is only 0.03%--2.60% (0.61% on average) off the true accuracy. Besides,
Aries also outperforms the state-of-the-art selection-labeling-based methods
in most (96 out of 128) cases.
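For intuition, the decision-boundary insight admits a compact sketch. The following is a minimal, hypothetical illustration, not the authors' Aries implementation: it uses the gap between the top-1 and top-2 softmax probabilities as a proxy for distance to the decision boundary, buckets inputs by that gap, and assumes each bucket's accuracy on the labeled test set carries over to unlabeled data falling in the same bucket. The margin proxy, the bucket count, and all function names are our own assumptions.

```python
# NOTE: a simplified sketch of the bucketing idea, not the paper's method.
import numpy as np

def boundary_margin(probs: np.ndarray) -> np.ndarray:
    """Proxy for distance to the decision boundary: the gap between the
    top-1 and top-2 softmax probabilities (a small gap means the input
    sits close to the boundary)."""
    top2 = np.sort(probs, axis=1)[:, -2:]
    return top2[:, 1] - top2[:, 0]

def estimate_accuracy(probs_test, y_test, probs_new, n_buckets=10) -> float:
    """Estimate accuracy on unlabeled data by bucketing inputs on
    boundary distance and reusing each bucket's accuracy, measured on
    the original labeled test set, weighted by how the new data fall
    into the buckets."""
    edges = np.linspace(0.0, 1.0, n_buckets + 1)[1:-1]  # internal bin edges
    b_test = np.digitize(boundary_margin(probs_test), edges)
    b_new = np.digitize(boundary_margin(probs_new), edges)
    correct = probs_test.argmax(axis=1) == y_test

    estimate = 0.0
    for b in range(n_buckets):
        weight = np.mean(b_new == b)  # share of new data in bucket b
        if weight == 0.0:
            continue
        mask = b_test == b
        # Fall back to overall test accuracy for buckets empty in test data.
        acc = correct[mask].mean() if mask.any() else correct.mean()
        estimate += weight * acc
    return estimate
```

Under this scheme, inputs with a small margin (bucket 0) sit closest to the boundary and typically have the lowest accuracy; the final estimate is simply a bucket-weighted average of test-set accuracies.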
Related papers
- ASPEST: Bridging the Gap Between Active Learning and Selective
Prediction [56.001808843574395]
Selective prediction aims to learn a reliable model that abstains from making predictions when uncertain.
Active learning aims to lower the overall labeling effort, and hence human dependence, by querying the most informative examples.
In this work, we introduce a new learning paradigm, active selective prediction, which aims to query more informative samples from the shifted target domain.
arXiv Detail & Related papers (2023-04-07T23:51:07Z) - Labeling-Free Comparison Testing of Deep Learning Models [28.47632100019289]
We propose a labeling-free comparison testing approach to overcome the limitations of labeling effort and sampling randomness.
Our approach outperforms the baseline methods by up to 0.74 and 0.53 on Spearman's correlation and Kendall's $\tau$, regardless of the dataset and distribution shift.
arXiv Detail & Related papers (2022-04-08T10:55:45Z) - Debiased Pseudo Labeling in Self-Training [77.83549261035277]
Deep neural networks achieve remarkable performances on a wide range of tasks with the aid of large-scale labeled datasets.
To mitigate the requirement for labeled data, self-training is widely used in both academia and industry by pseudo labeling on readily available unlabeled data.
We propose Debiased, in which the generation and utilization of pseudo labels are decoupled by two independent heads.
arXiv Detail & Related papers (2022-02-15T02:14:33Z) - Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence and predicts target accuracy as the fraction of unlabeled examples whose confidence exceeds that threshold (see the sketch after this list).
arXiv Detail & Related papers (2022-01-11T23:01:12Z) - IADA: Iterative Adversarial Data Augmentation Using Formal Verification
and Expert Guidance [1.599072005190786]
We propose an iterative adversarial data augmentation framework to learn neural network models.
The proposed framework is applied to an artificial 2D dataset, the MNIST dataset, and a human motion dataset.
We show that our training method can improve the robustness and accuracy of the learned model.
arXiv Detail & Related papers (2021-08-16T03:05:53Z) - Self-Trained One-class Classification for Unsupervised Anomaly Detection [56.35424872736276]
Anomaly detection (AD) has various applications across domains, from manufacturing to healthcare.
In this work, we focus on unsupervised AD problems whose entire training data are unlabeled and may contain both normal and anomalous samples.
To tackle this problem, we build a robust one-class classification framework via data refinement.
We show that our method outperforms the state-of-the-art one-class classification method by 6.3 AUC and 12.5 average precision.
arXiv Detail & Related papers (2021-06-11T01:36:08Z) - ALT-MAS: A Data-Efficient Framework for Active Testing of Machine
Learning Algorithms [58.684954492439424]
We propose a novel framework to efficiently test a machine learning model using only a small amount of labeled test data.
The idea is to estimate the metrics of interest for a model-under-test using a Bayesian neural network (BNN).
arXiv Detail & Related papers (2021-04-11T12:14:04Z) - Self-Competitive Neural Networks [0.0]
Deep Neural Networks (DNNs) have improved the accuracy of classification problems in many applications.
One of the challenges in training a DNN is its need for an enriched dataset that increases its accuracy and prevents overfitting.
Recently, researchers have worked extensively to propose methods for data augmentation.
In this paper, we generate adversarial samples to refine the Domains of Attraction (DoAs) of each class. In this approach, at each stage, we use the model learned from the primary data and the adversarial data generated so far to manipulate the primary data in a way that looks complicated to the model.
arXiv Detail & Related papers (2020-08-22T12:28:35Z) - DEAL: Deep Evidential Active Learning for Image Classification [0.0]
Active Learning (AL) is one approach to mitigate the problem of limited labeled data.
Recent AL methods for CNNs propose different solutions for the selection of instances to be labeled.
We propose a novel AL algorithm that efficiently learns from unlabeled data by capturing high prediction uncertainty.
arXiv Detail & Related papers (2020-07-22T11:14:23Z) - Increasing Trustworthiness of Deep Neural Networks via Accuracy
Monitoring [20.456742449675904]
Inference accuracy of deep neural networks (DNNs) is a crucial performance metric, but it can vary greatly in practice depending on the actual test dataset.
This has raised significant concerns about the trustworthiness of DNNs, especially in safety-critical applications.
We propose a neural network-based accuracy monitor model, which only takes the deployed DNN's softmax probability output as its input.
arXiv Detail & Related papers (2020-07-03T03:09:36Z) - Omni-supervised Facial Expression Recognition via Distilled Data [120.11782405714234]
We propose omni-supervised learning to exploit reliable samples in a large amount of unlabeled data for network training.
We experimentally verify that the new dataset can significantly improve the ability of the learned FER model.
To reduce the cost of training on this enlarged dataset, we propose to apply a dataset distillation strategy to compress the created dataset into several informative class-wise images.
arXiv Detail & Related papers (2020-05-18T09:36:51Z)
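For the ATC entry above, here is a rough sketch of the thresholding idea under our own simplifying assumptions: it uses max softmax confidence as the score (the paper also considers other scores), calibrates the threshold on labeled source data so that the fraction of examples above it matches the source accuracy, and reports the fraction of unlabeled target examples above the threshold as the predicted accuracy. Function names are hypothetical.

```python
# NOTE: a simplified sketch of ATC-style thresholding, not the paper's code.
import numpy as np

def atc_fit_threshold(probs_src: np.ndarray, y_src: np.ndarray) -> float:
    """Calibrate on labeled source data: choose t so that the fraction of
    source examples with max softmax confidence >= t equals the source
    accuracy."""
    confidence = probs_src.max(axis=1)
    accuracy = np.mean(probs_src.argmax(axis=1) == y_src)
    # The (1 - accuracy)-quantile leaves a fraction `accuracy` above t.
    return float(np.quantile(confidence, 1.0 - accuracy))

def atc_predict_accuracy(probs_tgt: np.ndarray, t: float) -> float:
    """Predicted target accuracy: fraction of unlabeled target examples
    whose max softmax confidence clears the threshold."""
    return float(np.mean(probs_tgt.max(axis=1) >= t))
```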