SETA: Statistical Fault Attribution for Compound AI Systems
- URL: http://arxiv.org/abs/2601.19337v1
- Date: Tue, 27 Jan 2026 08:21:28 GMT
- Title: SETA: Statistical Fault Attribution for Compound AI Systems
- Authors: Sayak Chowdhury, Meenakshi D'Souza,
- Abstract summary: Testing AI systems for robustness and safety entails significant challenges.<n>We propose a modular robustness testing framework that applies a given set of perturbations to test data.<n>We apply the framework to a real-world autonomous rail inspection system composed of multiple deep networks.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Modern AI systems increasingly comprise multiple interconnected neural networks to tackle complex inference tasks. Testing such systems for robustness and safety entails significant challenges. Current state-of-the-art robustness testing techniques, whether black-box or white-box, have been proposed and implemented for single-network models and do not scale well to multi-network pipelines. We propose a modular robustness testing framework that applies a given set of perturbations to test data. Our testing framework supports (1) a component-wise system analysis to isolate errors and (2) reasoning about error propagation across the neural network modules. The testing framework is architecture and modality agnostic and can be applied across domains. We apply the framework to a real-world autonomous rail inspection system composed of multiple deep networks and successfully demonstrate how our approach enables fine-grained robustness analysis beyond conventional end-to-end metrics.
Related papers
- Multi-Agent Collaborative Intrusion Detection for Low-Altitude Economy IoT: An LLM-Enhanced Agentic AI Framework [60.72591149679355]
The rapid expansion of low-altitude economy Internet of Things (LAE-IoT) networks has created unprecedented security challenges.<n>Traditional intrusion detection systems fail to tackle the unique characteristics of aerial IoT environments.<n>We introduce a large language model (LLM)-enabled agentic AI framework for enhancing intrusion detection in LAE-IoT networks.
arXiv Detail & Related papers (2026-01-25T12:47:25Z) - Demystifying deep search: a holistic evaluation with hint-free multi-hop questions and factorised metrics [89.1999907891494]
We present WebDetective, a benchmark of hint-free multi-hop questions paired with a controlled Wikipedia sandbox.<n>Our evaluation of 25 state-of-the-art models reveals systematic weaknesses across all architectures.<n>We develop an agentic workflow, EvidenceLoop, that explicitly targets the challenges our benchmark identifies.
arXiv Detail & Related papers (2025-10-01T07:59:03Z) - AI-Compass: A Comprehensive and Effective Multi-module Testing Tool for AI Systems [26.605694684145313]
In this study, we design and implement a testing tool, tool, to comprehensively and effectively evaluate AI systems.
The tool extensively assesses adversarial robustness, model interpretability, and performs neuron analysis.
Our research sheds light on a general solution for AI systems testing landscape.
arXiv Detail & Related papers (2024-11-09T11:15:17Z) - Distributionally Robust Statistical Verification with Imprecise Neural Networks [3.9456691693452552]
A particularly challenging problem in AI safety is providing guarantees on the behavior of high-dimensional autonomous systems.<n>This paper proposes a novel approach based on uncertainty quantification using concepts from probabilities.<n>We show that our approach can provide useful and scalable guarantees for high-dimensional systems.
arXiv Detail & Related papers (2023-08-28T18:06:24Z) - MMRNet: Improving Reliability for Multimodal Object Detection and
Segmentation for Bin Picking via Multimodal Redundancy [68.7563053122698]
We propose a reliable object detection and segmentation system with MultiModal Redundancy (MMRNet)
This is the first system that introduces the concept of multimodal redundancy to address sensor failure issues during deployment.
We present a new label-free multi-modal consistency (MC) score that utilizes the output from all modalities to measure the overall system output reliability and uncertainty.
arXiv Detail & Related papers (2022-10-19T19:15:07Z) - Adaptive Anomaly Detection for Internet of Things in Hierarchical Edge
Computing: A Contextual-Bandit Approach [81.5261621619557]
We propose an adaptive anomaly detection scheme with hierarchical edge computing (HEC)
We first construct multiple anomaly detection DNN models with increasing complexity, and associate each of them to a corresponding HEC layer.
Then, we design an adaptive model selection scheme that is formulated as a contextual-bandit problem and solved by using a reinforcement learning policy network.
arXiv Detail & Related papers (2021-08-09T08:45:47Z) - Anomaly Detection on Attributed Networks via Contrastive Self-Supervised
Learning [50.24174211654775]
We present a novel contrastive self-supervised learning framework for anomaly detection on attributed networks.
Our framework fully exploits the local information from network data by sampling a novel type of contrastive instance pair.
A graph neural network-based contrastive learning model is proposed to learn informative embedding from high-dimensional attributes and local structure.
arXiv Detail & Related papers (2021-02-27T03:17:20Z) - Increasing the Confidence of Deep Neural Networks by Coverage Analysis [71.57324258813674]
This paper presents a lightweight monitoring architecture based on coverage paradigms to enhance the model against different unsafe inputs.
Experimental results show that the proposed approach is effective in detecting both powerful adversarial examples and out-of-distribution inputs.
arXiv Detail & Related papers (2021-01-28T16:38:26Z) - Reliability Validation of Learning Enabled Vehicle Tracking [13.642244802899471]
This paper studies the reliability of a real-world learning-enabled system, which conducts dynamic vehicle tracking.
It is known that neural networks suffer from adversarial examples, which make them lack robustness.
By integrating a coverage-guided neural network testing tool, DeepConcolic, with the vehicle tracking system, we found that the overall system can be resilient to some adversarial examples.
arXiv Detail & Related papers (2020-02-06T18:07:54Z) - Deep Learning-Based Intrusion Detection System for Advanced Metering
Infrastructure [0.0]
The smart grid is exposed to a wide variety of threats that could be translated into cyber-attacks.
In this paper, we develop a deep learning-based intrusion detection system to defend against cyber-attacks.
arXiv Detail & Related papers (2019-12-31T21:06:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.