From PREVENTion to REACTion: Enhancing Failure Resolution in Naval Systems
- URL: http://arxiv.org/abs/2508.15584v1
- Date: Thu, 21 Aug 2025 13:57:14 GMT
- Title: From PREVENTion to REACTion: Enhancing Failure Resolution in Naval Systems
- Authors: Maria Teresa Rossi, Leonardo Mariani, Oliviero Riganelli,
- Abstract summary: This paper reports our experience with a state-of-the-art failure prediction method, PREVENT, and its extension with a troubleshooting module, REACT, applied to naval systems developed by Fincantieri. We conclude by discussing a lesson learned, which may help deploy and extend these analyses to other industrial products.
- Score: 4.171555557592296
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Complex and large industrial systems often misbehave, for instance, due to wear, misuse, or faults. To cope with these incidents, it is important to timely detect their occurrences, localize the sources of the problems, and implement the appropriate countermeasures. This paper reports our experience with a state-of-the-art failure prediction method, PREVENT, and its extension with a troubleshooting module, REACT, applied to naval systems developed by Fincantieri. Our results show how to integrate anomaly detection with troubleshooting procedures. We conclude by discussing a lesson learned, which may help deploy and extend these analyses to other industrial products.
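As a rough illustration of how anomaly detection can be combined with troubleshooting procedures, the sketch below flags signals that deviate from a baseline and maps each one to a remediation step. The signal names, thresholds, and procedures are hypothetical and are not taken from PREVENT, REACT, or Fincantieri's systems.

```python
# Minimal sketch: z-score anomaly detection coupled with a troubleshooting
# lookup. All signals, thresholds, and procedures below are illustrative.
from statistics import mean, stdev

def detect_anomalies(window, baseline, z_threshold=3.0):
    """Flag signals whose latest reading deviates from the baseline window."""
    anomalous = []
    for signal, readings in baseline.items():
        mu, sigma = mean(readings), stdev(readings)
        z = abs(window[signal] - mu) / sigma if sigma else 0.0
        if z > z_threshold:
            anomalous.append(signal)
    return anomalous

# Hypothetical troubleshooting knowledge base: anomalous signal -> procedure.
PROCEDURES = {
    "coolant_temp": "Inspect cooling circuit; check pump and heat exchanger.",
    "shaft_vibration": "Check bearing wear and shaft alignment.",
}

def react(window, baseline):
    """Return a troubleshooting suggestion for each detected anomaly."""
    return {s: PROCEDURES.get(s, "Escalate to manual diagnosis.")
            for s in detect_anomalies(window, baseline)}

baseline = {
    "coolant_temp": [70.0, 71.0, 69.5, 70.5, 70.2],
    "shaft_vibration": [0.10, 0.11, 0.09, 0.10, 0.10],
}
print(react({"coolant_temp": 95.0, "shaft_vibration": 0.10}, baseline))
```

Here only the overheated coolant signal is reported, together with its associated procedure; the vibration reading stays within the baseline.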
Related papers
- Wink: Recovering from Misbehaviors in Coding Agents [6.794419834325995]
Autonomous coding agents are increasingly being adopted in the software industry to automate complex engineering tasks. These agents are prone to a wide range of misbehaviors, such as deviating from the user's instructions, getting stuck in repetitive loops, or failing to use tools correctly. We present a system for automatically recovering from agentic misbehaviors at scale.
arXiv Detail & Related papers (2026-02-19T03:15:00Z)
- Why Does the LLM Stop Computing: An Empirical Study of User-Reported Failures in Open-Source LLMs [50.075587392477935]
We conduct the first large-scale empirical study of 705 real-world failures from the open-source DeepSeek, Llama, and Qwen ecosystems. Our analysis reveals a paradigm shift: white-box orchestration relocates the reliability bottleneck from model algorithmic defects to the systemic fragility of the deployment stack.
arXiv Detail & Related papers (2026-01-20T06:42:56Z)
- Combining SHAP and Causal Analysis for Interpretable Fault Detection in Industrial Processes [1.924423011183876]
This study tackles such difficulties using the Tennessee Eastman Process, a well-established benchmark known for its intricate dynamics. We transform the problem into a more manageable and transparent form, pinpointing the most critical process features driving fault predictions. The resulting causal structures align strikingly with SHAP findings, consistently highlighting key process elements, like cooling and separation systems, as pivotal to fault development.
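The idea of pinpointing which features drive a fault prediction can be sketched as below. SHAP values are approximated here by a crude leave-one-out attribution on a toy fault model; the Tennessee Eastman-style feature names and coefficients are illustrative, not from the paper.

```python
# Sketch of attributing a fault prediction to input features. A feature's
# attribution is the score drop when it is reset to its baseline value.
def fault_score(x):
    """Toy fault model: cooling and separation deviations dominate the risk."""
    return (0.6 * x["cooling_temp_dev"]
            + 0.3 * x["separator_level_dev"]
            + 0.1 * x["feed_rate_dev"])

def leave_one_out_attribution(x, baseline):
    """Crude stand-in for SHAP: per-feature contribution to the score."""
    full = fault_score(x)
    return {feature: full - fault_score(dict(x, **{feature: baseline[feature]}))
            for feature in x}

x = {"cooling_temp_dev": 2.0, "separator_level_dev": 1.0, "feed_rate_dev": 0.1}
baseline = {k: 0.0 for k in x}
attr = leave_one_out_attribution(x, baseline)
print(max(attr, key=attr.get))  # feature contributing most to the fault score
```

In this synthetic example the cooling deviation dominates, mirroring the abstract's finding that cooling and separation systems are pivotal to fault development.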
arXiv Detail & Related papers (2025-10-27T19:56:46Z)
- Metacognitive Self-Correction for Multi-Agent System via Prototype-Guided Next-Execution Reconstruction [58.51530390018909]
Large Language Model based multi-agent systems excel at collaborative problem solving but remain brittle to cascading errors. We present MASC, a metacognitive framework that endows MAS with real-time, unsupervised, step-level error detection and self-correction.
arXiv Detail & Related papers (2025-10-16T05:35:37Z)
- Confounding is a Pervasive Problem in Real World Recommender Systems [84.09696908897168]
Unobserved confounding undermines observational studies in fields like economics, medicine, ecology or epidemiology. This paper will show that numerous common practices such as feature engineering, A/B testing and modularization can in fact introduce confounding into recommendation systems.
arXiv Detail & Related papers (2025-08-14T09:31:35Z)
- A Survey on AgentOps: Categorization, Challenges, and Future Directions [25.00082531560766]
This paper introduces a novel and comprehensive operational framework for agent systems, dubbed Agent System Operations (AgentOps). We provide detailed definitions and explanations of its four key stages: monitoring, anomaly detection, root cause analysis, and resolution.
arXiv Detail & Related papers (2025-08-04T06:59:36Z)
- RealHarm: A Collection of Real-World Language Model Application Failures [1.2820953788225848]
We introduce RealHarm, a dataset of annotated problematic interactions with AI agents. We analyze harms, causes, and hazards specifically from the deployer's perspective. We evaluate state-of-the-art guardrails and content moderation systems to probe whether such systems would have prevented the incidents.
arXiv Detail & Related papers (2025-04-14T14:44:41Z)
- On the Fly Detection of Root Causes from Observed Data with Application to IT Systems [3.3321350585823826]
This paper introduces a new structural causal model tailored for representing threshold-based IT systems.
It presents a new algorithm designed to rapidly detect root causes of anomalies in such systems.
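Root-cause isolation over a causal model can be sketched as follows: among the anomalous components, the candidates for the root cause are those whose causal parents are all behaving normally. The graph and component names below are illustrative, not the paper's model or algorithm.

```python
# Sketch of root-cause isolation on a causal graph of system components.
# Hypothetical causal graph: child -> list of causal parents.
parents = {
    "db_latency": [],
    "cdn_latency": [],
    "api_latency": ["db_latency"],
    "page_load": ["api_latency", "cdn_latency"],
}

def root_causes(anomalous, parents):
    """Anomalous nodes with no anomalous causal parent are candidate roots."""
    return sorted(n for n in anomalous
                  if not any(p in anomalous for p in parents[n]))

print(root_causes({"db_latency", "api_latency", "page_load"}, parents))
```

In this example the anomaly propagates from the database through the API to the page load, so only `db_latency` survives as a candidate root cause.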
arXiv Detail & Related papers (2024-02-09T16:10:19Z)
- Analyzing Adversarial Inputs in Deep Reinforcement Learning [53.3760591018817]
We present a comprehensive analysis of the characterization of adversarial inputs, through the lens of formal verification.
We introduce a novel metric, the Adversarial Rate, to classify models based on their susceptibility to such perturbations.
Our analysis empirically demonstrates how adversarial inputs can affect the safety of a given DRL system with respect to such perturbations.
arXiv Detail & Related papers (2024-02-07T21:58:40Z)
- Supporting Early-Safety Analysis of IoT Systems by Exploiting Testing Techniques [9.095386349136717]
Failure Logic Analysis (FLA) is a technique that helps predict potential failure scenarios.
Manually specifying FLA rules can be arduous and error-prone, leading to incomplete or inaccurate specifications.
We propose adopting testing methodologies to improve the completeness and correctness of these rules.
arXiv Detail & Related papers (2023-09-06T13:32:39Z) - Causal Disentanglement Hidden Markov Model for Fault Diagnosis [55.90917958154425]
We propose a Causal Disentanglement Hidden Markov model (CDHM) to learn the causality in the bearing fault mechanism.
Specifically, we make full use of the time-series data and progressively disentangle the vibration signal into fault-relevant and fault-irrelevant factors.
To expand the scope of the application, we adopt unsupervised domain adaptation to transfer the learned disentangled representations to other working environments.
arXiv Detail & Related papers (2023-08-06T05:58:45Z) - Interactive System-wise Anomaly Detection [66.3766756452743]
Anomaly detection plays a fundamental role in various applications.
It is challenging for existing methods to handle the scenarios where the instances are systems whose characteristics are not readily observed as data.
We develop an end-to-end approach which includes an encoder-decoder module that learns system embeddings.
arXiv Detail & Related papers (2023-04-21T02:20:24Z) - Causality-Based Multivariate Time Series Anomaly Detection [63.799474860969156]
We formulate the anomaly detection problem from a causal perspective and view anomalies as instances that do not follow the regular causal mechanism to generate the multivariate data.
We then propose a causality-based anomaly detection approach, which first learns the causal structure from data and then infers whether an instance is an anomaly relative to the local causal mechanism.
We evaluate our approach with both simulated and public datasets as well as a case study on real-world AIOps applications.
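The core idea of judging an instance against its local causal mechanism can be sketched in a few lines: fit a structural equation for a variable from its causal parent, then flag instances whose residual under that equation is large. The single-parent linear mechanism and the data below are synthetic stand-ins for the paper's learned causal structure.

```python
# Sketch of causality-based anomaly scoring: a variable is checked against
# its local causal mechanism (here a linear structural equation fitted by
# ordinary least squares); a large residual marks a mechanism violation.
def fit_line(xs, ys):
    """OLS slope and intercept for y ~ a*x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return a, my - a * mx

# Training data approximately following the mechanism y = 2*x + noise.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.0, 9.9]
a, b = fit_line(xs, ys)

def causal_anomaly(x, y, tol=1.0):
    """Anomalous iff y deviates from its causal prediction by more than tol."""
    return abs(y - (a * x + b)) > tol

print(causal_anomaly(3.0, 6.1))   # consistent with the learned mechanism
print(causal_anomaly(3.0, 12.0))  # violates the local causal mechanism
```

The point of the causal framing is that the second instance is flagged not because its values are individually extreme, but because their relationship breaks the mechanism that generated the training data.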
arXiv Detail & Related papers (2022-06-30T06:00:13Z)
- Mining Root Cause Knowledge from Cloud Service Incident Investigations for AIOps [71.12026848664753]
Root Cause Analysis (RCA) of any service-disrupting incident is one of the most critical as well as complex tasks in IT processes.
In this work, we present ICA and the downstream Incident Search and Retrieval based RCA pipeline, built at Salesforce.
arXiv Detail & Related papers (2022-04-21T02:33:34Z)
- Discovering and Validating AI Errors With Crowdsourced Failure Reports [10.4818618376202]
We introduce crowdsourced failure reports, end-user descriptions of how or why a model failed, and show how developers can use them to detect AI errors.
We also design and implement Deblinder, a visual analytics system for synthesizing failure reports.
In semi-structured interviews and think-aloud studies with 10 AI practitioners, we explore the affordances of the Deblinder system and the applicability of failure reports in real-world settings.
arXiv Detail & Related papers (2021-09-23T23:26:59Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.