Towards Assessing Deep Learning Test Input Generators
- URL: http://arxiv.org/abs/2504.02329v2
- Date: Mon, 07 Apr 2025 18:35:13 GMT
- Title: Towards Assessing Deep Learning Test Input Generators
- Authors: Seif Mzoughi, Ahmed Haj yahmed, Mohamed Elshafei, Foutse Khomh, Diego Elias Costa,
- Abstract summary: This paper presents a comprehensive assessment of four state-of-the-art Test Input Generators (TIGs). Our findings reveal important trade-offs in robustness-revealing capability, variation in test case generation, and computational efficiency across TIGs. This paper offers practical guidance for selecting appropriate TIGs aligned with specific objectives and dataset characteristics.
- Score: 10.05882029297834
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep Learning (DL) systems are increasingly deployed in safety-critical applications, yet they remain vulnerable to robustness issues that can lead to significant failures. While numerous Test Input Generators (TIGs) have been developed to evaluate DL robustness, a comprehensive assessment of their effectiveness across different dimensions is still lacking. This paper presents a comprehensive assessment of four state-of-the-art TIGs--DeepHunter, DeepFault, AdvGAN, and SinVAD--across multiple critical aspects: fault-revealing capability, naturalness, diversity, and efficiency. Our empirical study leverages three pre-trained models (LeNet-5, VGG16, and EfficientNetB3) on datasets of varying complexity (MNIST, CIFAR-10, and ImageNet-1K) to evaluate TIG performance. Our findings reveal important trade-offs in robustness-revealing capability, variation in test case generation, and computational efficiency across TIGs. The results also show that TIG performance varies significantly with dataset complexity: tools that perform well on simpler datasets may struggle with more complex ones, while others maintain steadier performance or scale better. This paper offers practical guidance for selecting appropriate TIGs aligned with specific objectives and dataset characteristics. Nonetheless, more work is needed to address TIG limitations and advance TIGs for real-world, safety-critical systems.
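The abstract's first evaluation dimension, fault-revealing capability, can be illustrated with a minimal sketch. This is not the paper's actual harness: the function name, the callable-model interface, and the stub model below are all illustrative assumptions; the idea is simply that a TIG's generated inputs are scored by how often they flip a model's prediction away from the original label.

```python
# Hedged sketch (not the paper's implementation): scoring a TIG's
# fault-revealing capability as the fraction of generated inputs
# that the model under test misclassifies.

def fault_revealing_rate(model, test_inputs, original_labels):
    """Fraction of generated inputs whose prediction differs from the label.

    `model` is any callable mapping one input to a predicted class;
    this interface is an assumption for illustration.
    """
    if not test_inputs:
        return 0.0
    faults = sum(
        1 for x, y in zip(test_inputs, original_labels)
        if model(x) != y
    )
    return faults / len(test_inputs)


def stub_model(x):
    # Toy stand-in for a pre-trained classifier: predicts sign of the sum.
    return int(sum(x) > 0)


inputs = [[1.0, 2.0], [-3.0, 1.0], [0.5, -2.0]]
labels = [1, 1, 0]  # the second input is misclassified by the stub model
print(fault_revealing_rate(stub_model, inputs, labels))  # one third
```

In the paper's setting, `model` would be one of the pre-trained classifiers (LeNet-5, VGG16, EfficientNetB3) and `test_inputs` the outputs of DeepHunter, DeepFault, AdvGAN, or SinVAD.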
Related papers
- Benchmarking Generative AI Models for Deep Learning Test Input Generation [6.674615464230326]
Test Input Generators (TIGs) are crucial to assess the ability of Deep Learning (DL) image classifiers to provide correct predictions for inputs beyond their training and test sets. Recent advancements in Generative AI (GenAI) models have made them a powerful tool for creating and manipulating synthetic images. We benchmark and combine different GenAI models with TIGs, assessing their effectiveness, efficiency, and the quality of the generated test images.
arXiv Detail & Related papers (2024-12-23T15:30:42Z) - Data Quality Issues in Vulnerability Detection Datasets [1.6114012813668932]
Vulnerability detection is a crucial yet challenging task to identify potential weaknesses in software for cyber security.
Deep learning (DL) has made great progress in automating the detection process.
Many datasets have been created to train DL models for this purpose.
However, these datasets suffer from several issues that can lead to low detection accuracy of DL models.
arXiv Detail & Related papers (2024-10-08T13:31:29Z) - Generative Adversarial Networks for Imputing Sparse Learning Performance [3.0350058108125646]
This paper proposes using the Generative Adversarial Imputation Networks (GAIN) framework to impute sparse learning performance data.
Our customized GAIN-based method imputes sparse data in a 3D tensor space.
This approach enhances comprehensive learning data modeling and analytics in AI-based education.
arXiv Detail & Related papers (2024-07-26T17:09:48Z) - Improving GBDT Performance on Imbalanced Datasets: An Empirical Study of Class-Balanced Loss Functions [3.559225731091162]
This paper presents the first comprehensive study on adapting class-balanced loss functions to three Gradient Boosting Decision Trees (GBDT) algorithms.
We conduct extensive experiments on multiple datasets to evaluate the impact of class-balanced losses on different GBDT models.
Our results demonstrate the potential of class-balanced loss functions to enhance GBDT performance on imbalanced datasets.
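A common family of class-balanced losses, of the kind this related paper studies, reweights classes by their "effective number of samples" (Cui et al., 2019). Whether the paper adapts exactly this variant to GBDTs is an assumption; the sketch below only illustrates how such per-class weights are derived from class counts.

```python
# Hedged sketch: class-balanced weights via the effective number of
# samples, w_c proportional to (1 - beta) / (1 - beta^n_c). Using this
# exact scheme is an illustrative assumption, not the paper's method.

def class_balanced_weights(counts, beta=0.999):
    """Return one weight per class, inversely proportional to the
    effective number of samples, normalized to sum to len(counts)."""
    effective = [(1.0 - beta ** n) / (1.0 - beta) for n in counts]
    raw = [1.0 / e for e in effective]
    scale = len(counts) / sum(raw)  # normalize so weights sum to C
    return [w * scale for w in raw]


weights = class_balanced_weights([900, 90, 10])  # imbalanced class counts
print(weights)  # minority classes receive the largest weights
```

In a GBDT setting these weights would typically be passed as per-sample weights (each sample weighted by its class's weight) when fitting the model.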
arXiv Detail & Related papers (2024-07-19T15:10:46Z) - Generalized Face Forgery Detection via Adaptive Learning for Pre-trained Vision Transformer [54.32283739486781]
We present a Forgery-aware Adaptive Vision Transformer (FA-ViT) under the adaptive learning paradigm.
FA-ViT achieves 93.83% and 78.32% AUC scores on Celeb-DF and DFDC datasets in the cross-dataset evaluation.
arXiv Detail & Related papers (2023-09-20T06:51:11Z) - GOHSP: A Unified Framework of Graph and Optimization-based Heterogeneous Structured Pruning for Vision Transformer [76.2625311630021]
Vision transformers (ViTs) have shown very impressive empirical performance in various computer vision tasks.
To mitigate their heavy computational cost, structured pruning is a promising solution to compress model size and enable practical efficiency.
We propose GOHSP, a unified framework of Graph and Optimization-based Structured Pruning for ViT models.
arXiv Detail & Related papers (2023-01-13T00:40:24Z) - Towards Robust Dataset Learning [90.2590325441068]
We propose a principled, tri-level optimization to formulate the robust dataset learning problem.
Under an abstraction model that characterizes robust vs. non-robust features, the proposed method provably learns a robust dataset.
arXiv Detail & Related papers (2022-11-19T17:06:10Z) - Feature Extraction for Machine Learning-based Intrusion Detection in IoT Networks [6.6147550436077776]
This paper aims to discover whether Feature Reduction (FR) and Machine Learning (ML) techniques can be generalised across various datasets.
The detection accuracy of three Feature Extraction (FE) algorithms is evaluated: Principal Component Analysis (PCA), Auto-encoder (AE), and Linear Discriminant Analysis (LDA).
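Of the three feature-extraction algorithms this related paper evaluates, PCA is the simplest to sketch. The pipeline details below (toy data, component count) are illustrative assumptions, not the paper's setup; the sketch only shows the core step of projecting features onto the top principal components before training a detector.

```python
import numpy as np

# Hedged sketch: PCA as a feature-reduction step ahead of a downstream
# intrusion-detection classifier. Illustrative only; not the paper's
# exact pipeline or parameters.

def pca_reduce(X, k):
    """Project the rows of X onto their top-k principal components."""
    Xc = X - X.mean(axis=0)                    # center each feature
    cov = np.cov(Xc, rowvar=False)             # feature covariance matrix
    vals, vecs = np.linalg.eigh(cov)           # eigenvalues, ascending
    top = vecs[:, np.argsort(vals)[::-1][:k]]  # top-k eigenvectors
    return Xc @ top


X = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.1, 6.2],
              [3.0, 6.2, 9.1],
              [4.0, 7.9, 12.3]])
Z = pca_reduce(X, 2)
print(Z.shape)  # features reduced from 3 to 2
```

An autoencoder would replace the linear projection with a learned nonlinear encoder, and LDA would choose directions that separate the attack/benign classes rather than maximize variance.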
arXiv Detail & Related papers (2021-08-28T23:52:18Z) - Geometry Uncertainty Projection Network for Monocular 3D Object Detection [138.24798140338095]
We propose a Geometry Uncertainty Projection Network (GUP Net) to tackle the error amplification problem at both inference and training stages.
Specifically, a GUP module is proposed to obtain the geometry-guided uncertainty of the inferred depth.
At the training stage, we propose a Hierarchical Task Learning strategy to reduce the instability caused by error amplification.
arXiv Detail & Related papers (2021-07-29T06:59:07Z) - Vision Transformers are Robust Learners [65.91359312429147]
We study the robustness of the Vision Transformer (ViT) against common corruptions and perturbations, distribution shifts, and natural adversarial examples.
We present analyses that provide both quantitative and qualitative indications to explain why ViTs are indeed more robust learners.
arXiv Detail & Related papers (2021-05-17T02:39:22Z) - Influence Functions in Deep Learning Are Fragile [52.31375893260445]
Influence functions approximate the effect of training samples on test-time predictions.
Influence estimates are fairly accurate for shallow networks.
Hessian regularization is important to obtain high-quality influence estimates.
arXiv Detail & Related papers (2020-06-25T18:25:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.