Related papers: DiffGAN: A Test Generation Approach for Differential Testing of Deep Neural Networks

URL: http://arxiv.org/abs/2410.19794v1
Date: Tue, 15 Oct 2024 23:49:01 GMT
Title: DiffGAN: A Test Generation Approach for Differential Testing of Deep Neural Networks
Authors: Zohreh Aghababaeyan, Manel Abdellatif, Lionel Briand, Ramesh S
Abstract summary: DiffGAN is a black-box test image generation approach for differential testing of Deep Neural Networks (DNNs). It generates diverse and valid triggering inputs that reveal behavioral discrepancies between models. Our results show DiffGAN significantly outperforms a SOTA baseline, generating four times more triggering inputs, with greater diversity and validity, within the same budget.
Score: 0.30292136896203486
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Deep Neural Networks (DNNs) are increasingly deployed across applications. However, ensuring their reliability remains a challenge, and in many situations, alternative models with similar functionality and accuracy are available. Traditional accuracy-based evaluations often fail to capture behavioral differences between models, especially with limited test datasets, making it difficult to select or combine models effectively. Differential testing addresses this by generating test inputs that expose discrepancies in DNN model behavior. However, existing approaches face significant limitations: many rely on model internals or are constrained by available seed inputs. To address these challenges, we propose DiffGAN, a black-box test image generation approach for differential testing of DNN models. DiffGAN leverages a Generative Adversarial Network (GAN) and the Non-dominated Sorting Genetic Algorithm II to generate diverse and valid triggering inputs that reveal behavioral discrepancies between models. DiffGAN employs two custom fitness functions, focusing on diversity and divergence, to guide the exploration of the GAN input space and identify discrepancies between models' outputs. By strategically searching this space, DiffGAN generates inputs with specific features that trigger differences in model behavior. DiffGAN is black-box, making it applicable in more situations. We evaluate DiffGAN on eight DNN model pairs trained on widely used image datasets. Our results show DiffGAN significantly outperforms a SOTA baseline, generating four times more triggering inputs, with greater diversity and validity, within the same budget. Additionally, the generated inputs improve the accuracy of a machine learning-based model selection mechanism, which selects the best-performing model based on input characteristics and can serve as a smart output voting mechanism when using alternative models.
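
The abstract describes a search over the GAN input space guided by two fitness functions (divergence and diversity) and NSGA-II. The following is a minimal sketch of that idea, not the authors' implementation. Everything in it is an assumption made for illustration: the `generator`, `model_a`, and `model_b` functions are toy stand-ins, divergence is taken to be the L1 distance between the two models' output probabilities, and diversity is taken to be the mean Euclidean distance between latent vectors. It shows only the non-dominated (first Pareto front) selection step; a full NSGA-II loop would also need crowding distance, crossover, and mutation.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def generator(z):
    """Toy stand-in for a GAN generator (assumption): returns z unchanged."""
    return z


def model_a(x):
    """Toy black-box model A (assumption): returns class probabilities only."""
    return softmax([x[0], x[1]])


def model_b(x):
    """Toy black-box model B (assumption): a slightly different classifier."""
    return softmax([x[0], 1.5 * x[1]])


def divergence(probs_a, probs_b):
    """Assumed divergence fitness: L1 distance between output probabilities."""
    return sum(abs(a - b) for a, b in zip(probs_a, probs_b))


def diversity(z, others):
    """Assumed diversity fitness: mean Euclidean distance to other candidates."""
    if not others:
        return 0.0
    return sum(math.dist(z, o) for o in others) / len(others)


def dominates(f, g):
    """True if objective vector f Pareto-dominates g (both maximized)."""
    return all(x >= y for x, y in zip(f, g)) and any(x > y for x, y in zip(f, g))


def pareto_front(scores):
    """Indices of non-dominated objective vectors (first front)."""
    return [
        i
        for i, f in enumerate(scores)
        if not any(dominates(g, f) for j, g in enumerate(scores) if j != i)
    ]


def score_population(latents):
    """Compute (divergence, diversity) for every latent vector."""
    scores = []
    for i, z in enumerate(latents):
        x = generator(z)
        others = latents[:i] + latents[i + 1:]
        scores.append((divergence(model_a(x), model_b(x)), diversity(z, others)))
    return scores


rng = random.Random(0)
population = [[rng.uniform(-2, 2), rng.uniform(-2, 2)] for _ in range(20)]
scores = score_population(population)
front = pareto_front(scores)
print("non-dominated candidates:", front)
```

Because only model outputs are queried, the sketch stays black-box in the sense the abstract uses: no gradients or internal activations of either model are needed.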

Related papers

Deep Probabilistic Modeling of User Behavior for Anomaly Detection via Mixture Density Networks [1.4993227168009349]
This paper proposes an anomaly detection method based on a deep mixture density network. It effectively captures the multimodal distribution characteristics commonly present in behavioral data. Experiments are conducted on the real-world network user dataset UNSW-NB15.
arXiv Detail & Related papers (2025-05-13T04:32:21Z)
BEND: Bagging Deep Learning Training Based on Efficient Neural Network Diffusion [56.9358325168226]
We propose a Bagging deep learning training algorithm based on Efficient Neural network Diffusion (BEND). Our approach is simple but effective, first using multiple trained model weights and biases as inputs to train an autoencoder and a latent diffusion model. Our proposed BEND algorithm can consistently outperform the mean and median accuracies of both the original trained model and the diffused model.
arXiv Detail & Related papers (2024-03-23T08:40:38Z)
Diffusion-TTA: Test-time Adaptation of Discriminative Models via Generative Feedback [97.0874638345205]
Generative models can be great test-time adapters for discriminative models. Our method, Diffusion-TTA, adapts pre-trained discriminative models to each unlabelled example in the test set. We show Diffusion-TTA significantly enhances the accuracy of various large-scale pre-trained discriminative models.
arXiv Detail & Related papers (2023-11-27T18:59:53Z)
Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution [67.9215891673174]
We propose score entropy as a novel loss that naturally extends score matching to discrete spaces. We test our Score Entropy Discrete Diffusion models on standard language modeling tasks.
arXiv Detail & Related papers (2023-10-25T17:59:12Z)
Improving Out-of-Distribution Robustness of Classifiers via Generative Interpolation [56.620403243640396]
Deep neural networks achieve superior performance for learning from independent and identically distributed (i.i.d.) data. However, their performance deteriorates significantly when handling out-of-distribution (OoD) data. We develop a simple yet effective method called Generative Interpolation to fuse generative models trained from multiple domains for synthesizing diverse OoD samples.
arXiv Detail & Related papers (2023-07-23T03:53:53Z)
DuDGAN: Improving Class-Conditional GANs via Dual-Diffusion [2.458437232470188]
Class-conditional image generation using generative adversarial networks (GANs) has been investigated through various techniques. We propose a novel approach for class-conditional image generation using GANs called DuDGAN, which incorporates a dual diffusion-based noise injection process. Our method outperforms state-of-the-art conditional GAN models for image generation in terms of performance.
arXiv Detail & Related papers (2023-05-24T07:59:44Z)
A Statistical-Modelling Approach to Feedforward Neural Network Model Selection [0.8287206589886881]
Feedforward neural networks (FNNs) can be viewed as non-linear regression models. A novel model selection method is proposed using the Bayesian information criterion (BIC) for FNNs. The choice of BIC over out-of-sample performance leads to an increased probability of recovering the true model.
arXiv Detail & Related papers (2022-07-09T11:07:04Z)
Characterizing and Understanding the Behavior of Quantized Models for Reliable Deployment [32.01355605506855]
Quantization-aware training can produce more stable models than standard, adversarial, and Mixup training. Disagreements often have closer top-1 and top-2 output probabilities, and $Margin$ is a better indicator than the other uncertainty metrics to distinguish disagreements. We open-source our code and models as a new benchmark for further studying the quantized models.
arXiv Detail & Related papers (2022-04-08T11:19:16Z)
Labeling-Free Comparison Testing of Deep Learning Models [28.47632100019289]
We propose a labeling-free comparison testing approach to overcome the limitations of labeling effort and sampling randomness. Our approach outperforms the baseline methods by up to 0.74 and 0.53 on Spearman's correlation and Kendall's $\tau$, regardless of the dataset and distribution shift.
arXiv Detail & Related papers (2022-04-08T10:55:45Z)
On the Effectiveness of Generative Adversarial Network on Anomaly Detection [1.6244541005112747]
GANs rely on the rich contextual information of these models to identify the actual training distribution. We suggest a new unsupervised model based on GANs, a combination of an autoencoder and a GAN. A new scoring function was introduced to target anomalies: a linear combination of the internal representation of the discriminator, the generator's visual representation, and the encoded representation of the autoencoder defines the proposed anomaly score.
arXiv Detail & Related papers (2021-12-31T16:35:47Z)
MEMO: Test Time Robustness via Adaptation and Augmentation [131.28104376280197]
We study the problem of test time robustification, i.e., using the test input to improve model robustness. Recent prior works have proposed methods for test time adaptation; however, they each introduce additional assumptions. We propose a simple approach that can be used in any test setting where the model is probabilistic and adaptable.
arXiv Detail & Related papers (2021-10-18T17:55:11Z)
Selecting Treatment Effects Models for Domain Adaptation Using Causal Knowledge [82.5462771088607]
We propose a novel model selection metric specifically designed for ITE methods under the unsupervised domain adaptation setting. In particular, we propose selecting models whose predictions of interventions' effects satisfy known causal structures in the target domain.
arXiv Detail & Related papers (2021-02-11T21:03:14Z)
Diversity inducing Information Bottleneck in Model Ensembles [73.80615604822435]
In this paper, we target the problem of generating effective ensembles of neural networks by encouraging diversity in prediction. We explicitly optimize a diversity inducing adversarial loss for learning latent variables and thereby obtain diversity in the output predictions necessary for modeling multi-modal data. Compared to the most competitive baselines, we show significant improvements in classification accuracy, under a shift in the data distribution.
arXiv Detail & Related papers (2020-03-10T03:10:41Z)

This list is automatically generated from the titles and abstracts of the papers on this site.