Constrained Co-evolutionary Metamorphic Differential Testing for Autonomous Systems with an Interpretability Approach
- URL: http://arxiv.org/abs/2509.16478v1
- Date: Sat, 20 Sep 2025 00:27:29 GMT
- Title: Constrained Co-evolutionary Metamorphic Differential Testing for Autonomous Systems with an Interpretability Approach
- Authors: Hossein Yousefizadeh, Shenghui Gu, Lionel C. Briand, Ali Nasr,
- Abstract summary: CoCoMagic is a novel test case generation method that combines metamorphic testing, differential testing, and advanced search-based techniques.<n>We show significant improvements over baseline search methods, identifying up to 287% more distinct high-severity behavioral differences.<n>CoCoMagic offers an efficient, effective, and interpretable way for the differential testing of evolving autonomous systems across versions.
- Score: 4.965735630892516
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Autonomous systems, such as autonomous driving systems, evolve rapidly through frequent updates, risking unintended behavioral degradations. Effective system-level testing is challenging due to the vast scenario space, the absence of reliable test oracles, and the need for practically applicable and interpretable test cases. We present CoCoMagic, a novel automated test case generation method that combines metamorphic testing, differential testing, and advanced search-based techniques to identify behavioral divergences between versions of autonomous systems. CoCoMagic formulates test generation as a constrained cooperative co-evolutionary search, evolving both source scenarios and metamorphic perturbations to maximize differences in violations of predefined metamorphic relations across versions. Constraints and population initialization strategies guide the search toward realistic, relevant scenarios. An integrated interpretability approach aids in diagnosing the root causes of divergences. We evaluate CoCoMagic on an end-to-end ADS, InterFuser, within the Carla virtual simulator. Results show significant improvements over baseline search methods, identifying up to 287\% more distinct high-severity behavioral differences while maintaining scenario realism. The interpretability approach provides actionable insights for developers, supporting targeted debugging and safety assessment. CoCoMagic offers an efficient, effective, and interpretable way for the differential testing of evolving autonomous systems across versions.
Related papers
- SelfAI: Building a Self-Training AI System with LLM Agents [79.10991818561907]
SelfAI is a general multi-agent platform that combines a User Agent for translating high-level research objectives into standardized experimental configurations.<n>An Experiment Manager orchestrates parallel, fault-tolerant training across heterogeneous hardware while maintaining a structured knowledge base for continuous feedback.<n>Across regression, computer vision, scientific computing, medical imaging, and drug discovery benchmarks, SelfAI consistently achieves strong performance and reduces redundant trials.
arXiv Detail & Related papers (2025-11-29T09:18:39Z) - Demystifying deep search: a holistic evaluation with hint-free multi-hop questions and factorised metrics [89.1999907891494]
We present WebDetective, a benchmark of hint-free multi-hop questions paired with a controlled Wikipedia sandbox.<n>Our evaluation of 25 state-of-the-art models reveals systematic weaknesses across all architectures.<n>We develop an agentic workflow, EvidenceLoop, that explicitly targets the challenges our benchmark identifies.
arXiv Detail & Related papers (2025-10-01T07:59:03Z) - Uncertainty Awareness on Unsupervised Domain Adaptation for Time Series Data [49.36938105983916]
Unsupervised domain adaptation methods seek to generalize effectively on unlabeled test data.<n>We propose incorporating multi-scale feature extraction and uncertainty estimation to improve the model's generalization and robustness across domains.
arXiv Detail & Related papers (2025-08-26T03:13:08Z) - Behavioral Anomaly Detection in Distributed Systems via Federated Contrastive Learning [0.8906214436849201]
The goal is to overcome the limitations of traditional centralized approaches in terms of data privacy, node heterogeneity, and anomaly pattern recognition.<n>The proposed method combines the distributed collaborative modeling capabilities of federated learning with the feature discrimination enhancement of contrastive learning.<n>It builds embedding representations on local nodes and constructs positive and negative sample pairs to guide the model in learning a more discriminative feature space.
arXiv Detail & Related papers (2025-06-24T02:04:44Z) - Causality-aware Safety Testing for Autonomous Driving Systems [12.138537069776884]
Comprehensive evaluation requires testing across diverse scenarios that can trigger various types of violations under different conditions.<n>We propose a novel Causal-aware fuzzing technique to enable efficient comprehensive testing of Autonomous Driving Systems.
arXiv Detail & Related papers (2025-06-10T10:53:26Z) - A Hybrid Framework for Statistical Feature Selection and Image-Based Noise-Defect Detection [55.2480439325792]
This paper presents a hybrid framework that integrates both statistical feature selection and classification techniques to improve defect detection accuracy.<n>We present around 55 distinguished features that are extracted from industrial images, which are then analyzed using statistical methods.<n>By integrating these methods with flexible machine learning applications, the proposed framework improves detection accuracy and reduces false positives and misclassifications.
arXiv Detail & Related papers (2024-12-11T22:12:21Z) - Using Cooperative Co-evolutionary Search to Generate Metamorphic Test Cases for Autonomous Driving Systems [7.519606883700972]
This paper introduces CoCoMEGA, a novel automated testing framework aimed at advancing system-level safety assessments of Autonomous Driving Systems (ADSs)<n>CoCoMEGA emphasizes the identification of test scenarios that present undesirable system behavior, that may eventually lead to safety violations, captured by Metamorphic Relations (MRs)<n>Future research directions may include extending the approach to additional simulation platforms, applying it to other complex systems, and exploring methods for further improving testing efficiency such as surrogate modeling.
arXiv Detail & Related papers (2024-12-05T03:17:31Z) - Unsupervised Model Diagnosis [49.36194740479798]
This paper proposes Unsupervised Model Diagnosis (UMO) to produce semantic counterfactual explanations without any user guidance.
Our approach identifies and visualizes changes in semantics, and then matches these changes to attributes from wide-ranging text sources.
arXiv Detail & Related papers (2024-10-08T17:59:03Z) - Testing for Fault Diversity in Reinforcement Learning [13.133263651395865]
We argue that policy testing should not find as many failures as possible (e.g., inputs that trigger similar car crashes) but rather aim at revealing as informative and diverse faults as possible in the model.
We show that QD optimisation, while being conceptually simple and generally applicable, finds effectively more diverse faults in the decision model.
arXiv Detail & Related papers (2024-03-22T09:46:30Z) - Discovering Decision Manifolds to Assure Trusted Autonomous Systems [0.0]
We propose an optimization-based search technique for capturing the range of correct and incorrect responses a system could exhibit.
This manifold provides a more detailed understanding of system reliability than traditional testing or Monte Carlo simulations.
In this proof-of-concept, we apply our method to a software-in-the-loop evaluation of an autonomous vehicle.
arXiv Detail & Related papers (2024-02-12T16:55:58Z) - Position: AI Evaluation Should Learn from How We Test Humans [65.36614996495983]
We argue that psychometrics, a theory originating in the 20th century for human assessment, could be a powerful solution to the challenges in today's AI evaluations.
arXiv Detail & Related papers (2023-06-18T09:54:33Z) - Interactive System-wise Anomaly Detection [66.3766756452743]
Anomaly detection plays a fundamental role in various applications.
It is challenging for existing methods to handle the scenarios where the instances are systems whose characteristics are not readily observed as data.
We develop an end-to-end approach which includes an encoder-decoder module that learns system embeddings.
arXiv Detail & Related papers (2023-04-21T02:20:24Z) - Dynamic Federated Learning [57.14673504239551]
Federated learning has emerged as an umbrella term for centralized coordination strategies in multi-agent environments.
We consider a federated learning model where at every iteration, a random subset of available agents perform local updates based on their data.
Under a non-stationary random walk model on the true minimizer for the aggregate optimization problem, we establish that the performance of the architecture is determined by three factors, namely, the data variability at each agent, the model variability across all agents, and a tracking term that is inversely proportional to the learning rate of the algorithm.
arXiv Detail & Related papers (2020-02-20T15:00:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.