Related papers: Coverage-Guided Testing for Deep Learning Models: A Comprehensive Survey

URL: http://arxiv.org/abs/2507.00496v1
Date: Tue, 01 Jul 2025 07:12:58 GMT
Title: Coverage-Guided Testing for Deep Learning Models: A Comprehensive Survey
Authors: Hongjing Guo, Chuanqi Tao, Zhiqiu Huang, Weiqin Zou
Abstract summary: Deep Learning (DL) models are increasingly applied in safety-critical domains. Coverage-guided testing (CGT) has gained prominence as a framework for identifying erroneous or unexpected model behaviors. Existing CGT studies remain methodologically fragmented, limiting the understanding of current advances and emerging trends.
Score: 4.797322346441166
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: As Deep Learning (DL) models are increasingly applied in safety-critical domains, ensuring their quality has emerged as a pressing challenge in modern software engineering. Among emerging validation paradigms, coverage-guided testing (CGT) has gained prominence as a systematic framework for identifying erroneous or unexpected model behaviors. Despite growing research attention, existing CGT studies remain methodologically fragmented, limiting the understanding of current advances and emerging trends. This work addresses that gap through a comprehensive review of state-of-the-art CGT methods for DL models, including test coverage analysis, coverage-guided test input generation, and coverage-guided test input optimization. This work provides detailed taxonomies to organize these methods based on methodological characteristics and application scenarios. We also investigate evaluation practices adopted in existing studies, including the use of benchmark datasets, model architectures, and evaluation aspects. Finally, open challenges and future directions are highlighted in terms of the correlation between structural coverage and testing objectives, method generalizability across tasks and models, practical deployment concerns, and the need for standardized evaluation and tool support. This work aims to provide a roadmap for future academic research and engineering practice in DL model quality assurance.

Related papers

AI-Driven Tools in Modern Software Quality Assurance: An Assessment of Benefits, Challenges, and Future Directions [0.0]
The research aims to assess the benefits, challenges, and prospects of integrating modern AI-oriented tools into quality assurance processes. The research demonstrates AI's transformative potential for QA but highlights the importance of a strategic approach to implementing these technologies.
arXiv Detail & Related papers (2025-06-19T20:22:47Z)
Anomaly Detection and Generation with Diffusion Models: A Survey [51.61574868316922]
Anomaly detection (AD) plays a pivotal role across diverse domains, including cybersecurity, finance, healthcare, and industrial manufacturing. Recent advancements in deep learning, specifically diffusion models (DMs), have sparked significant interest. This survey aims to guide researchers and practitioners in leveraging DMs for innovative AD solutions across diverse applications.
arXiv Detail & Related papers (2025-06-11T03:29:18Z)
A Concise Survey on Lane Topology Reasoning for HD Mapping [30.73664953504888]
Lane topology reasoning techniques play a crucial role in high-definition (HD) mapping and autonomous driving applications. Recent years have witnessed significant advances in this field, but there has been limited effort to consolidate these works into a comprehensive overview. This survey systematically reviews the evolution and current state of lane topology reasoning methods.
arXiv Detail & Related papers (2025-03-31T11:30:40Z)
A Survey on Computational Pathology Foundation Models: Datasets, Adaptation Strategies, and Evaluation Tasks [22.806228975730008]
Computational pathology foundation models (CPathFMs) have emerged as a powerful approach for analyzing histological data. These models have demonstrated promise in automating complex pathology tasks such as segmentation, classification, and biomarker discovery. However, the development of CPathFMs presents significant challenges, such as limited data accessibility, high variability across datasets, and lack of standardized evaluation benchmarks.
arXiv Detail & Related papers (2025-01-27T01:27:59Z)
A Survey of Models for Cognitive Diagnosis: New Developments and Future Directions [66.40362209055023]
This paper aims to provide a survey of current models for cognitive diagnosis, with more attention on new developments using machine learning-based methods. By comparing the model structures, parameter estimation algorithms, model evaluation methods and applications, we provide a relatively comprehensive review of the recent trends in cognitive diagnosis models.
arXiv Detail & Related papers (2024-07-07T18:02:00Z)
Position: Quo Vadis, Unsupervised Time Series Anomaly Detection? [11.269007806012931]
The current state of machine learning scholarship in Timeseries Anomaly Detection (TAD) is plagued by the persistent use of flawed evaluation metrics. Our paper presents a critical analysis of the status quo in TAD, revealing the misleading track of current research.
arXiv Detail & Related papers (2024-05-04T14:43:31Z)
Towards a Framework for Deep Learning Certification in Safety-Critical Applications Using Inherently Safe Design and Run-Time Error Detection [0.0]
We consider real-world problems arising in aviation and other safety-critical areas, and investigate their requirements for a certified model. We establish a new framework towards deep learning certification based on (i) inherently safe design, and (ii) run-time error detection.
arXiv Detail & Related papers (2024-03-12T11:38:45Z)
Standardizing Your Training Process for Human Activity Recognition Models: A Comprehensive Review in the Tunable Factors [4.199844472131922]
We provide an exhaustive review of contemporary deep learning research in the field of wearable human activity recognition (WHAR). Our findings suggest that a major trend is the lack of detail provided by model training protocols. With insights from the analyses, we define a novel integrated training procedure tailored to the WHAR model.
arXiv Detail & Related papers (2024-01-10T17:45:28Z)
Geometric Deep Learning for Structure-Based Drug Design: A Survey [83.87489798671155]
Structure-based drug design (SBDD) leverages the three-dimensional geometry of proteins to identify potential drug candidates. Recent advancements in geometric deep learning, which effectively integrate and process 3D geometric data, have significantly propelled the field forward.
arXiv Detail & Related papers (2023-06-20T14:21:58Z)
Position: AI Evaluation Should Learn from How We Test Humans [65.36614996495983]
We argue that psychometrics, a theory originating in the 20th century for human assessment, could be a powerful solution to the challenges in today's AI evaluations.
arXiv Detail & Related papers (2023-06-18T09:54:33Z)
Robustness and Generalization Performance of Deep Learning Models on Cyber-Physical Systems: A Comparative Study [71.84852429039881]
The investigation focuses on the models' ability to handle a range of perturbations, such as sensor faults and noise. We test the generalization and transfer learning capabilities of these models by exposing them to out-of-distribution (OOD) samples.
arXiv Detail & Related papers (2023-06-13T12:43:59Z)
A Comprehensive Survey on Test-Time Adaptation under Distribution Shifts [117.72709110877939]
Test-time adaptation (TTA) has the potential to adapt a pre-trained model to unlabeled data during testing, before making predictions. We categorize TTA into several distinct groups based on the form of test data, namely, test-time domain adaptation, test-time batch adaptation, and online test-time adaptation.
arXiv Detail & Related papers (2023-03-27T16:32:21Z)
GLUECons: A Generic Benchmark for Learning Under Constraints [102.78051169725455]
In this work, we create a benchmark that is a collection of nine tasks in the domains of natural language processing and computer vision. We model external knowledge as constraints, specify the sources of the constraints for each task, and implement various models that use these constraints.
arXiv Detail & Related papers (2023-02-16T16:45:36Z)
Self-Supervised Anomaly Detection in Computer Vision and Beyond: A Survey and Outlook [9.85256783464329]
Anomaly detection plays a crucial role in various domains, including cybersecurity, finance, and healthcare. In recent years, significant progress has been made in this field due to the remarkable growth of deep learning models. The advent of self-supervised learning has sparked the development of novel AD algorithms that outperform the existing state-of-the-art approaches.
arXiv Detail & Related papers (2022-05-10T21:16:14Z)

This list is automatically generated from the titles and abstracts of the papers on this site.