"Where is My Troubleshooting Procedure?": Studying the Potential of RAG in Assisting Failure Resolution of Large Cyber-Physical System
- URL: http://arxiv.org/abs/2601.08706v2
- Date: Wed, 14 Jan 2026 07:28:56 GMT
- Title: "Where is My Troubleshooting Procedure?": Studying the Potential of RAG in Assisting Failure Resolution of Large Cyber-Physical System
- Authors: Maria Teresa Rossi, Leonardo Mariani, Oliviero Riganelli, Giuseppe Filomento, Danilo Giannone, Paolo Gavazzo
- Abstract summary: Retrieval Augmented Generation (RAG) enables the development of tools that can assist operators in their retrieval tasks. This paper presents the results of a set of experiments that derive from the analysis of the troubleshooting procedures available in Fincantieri.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In today's complex industrial environments, operators must often navigate through extensive technical manuals to identify troubleshooting procedures that may help react to some observed failure symptoms. These manuals, written in natural language, describe many steps in detail. Unfortunately, the number, magnitude, and articulation of these descriptions can significantly slow down and complicate the retrieval of the correct procedure during critical incidents. Interestingly, Retrieval Augmented Generation (RAG) enables the development of tools based on conversational interfaces that can assist operators in their retrieval tasks, improving their capability to respond to incidents. This paper presents the results of a set of experiments that derive from the analysis of the troubleshooting procedures available in Fincantieri, a large international company developing complex naval cyber-physical systems. Results show that RAG can assist operators in reacting promptly to failure symptoms, although recommendations must be cross-validated before operators act on them.
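The retrieval step of such a RAG pipeline can be sketched in a few lines: rank manual excerpts by similarity to a free-text symptom description and surface the best match. The procedure texts, identifiers, and failure symptom below are purely illustrative, and a production system would use dense embeddings and an LLM-backed conversational layer rather than the bag-of-words cosine similarity used here.

```python
# Minimal sketch of RAG-style retrieval over troubleshooting procedures.
# Procedure snippets and IDs are hypothetical examples.
from collections import Counter
import math

procedures = {
    "P-101": "If the cooling pump pressure drops below threshold, inspect the inlet valve and restart the pump.",
    "P-102": "When the generator temperature alarm triggers, reduce load and check the ventilation ducts.",
    "P-103": "On loss of navigation signal, switch to the backup antenna and verify cable connections.",
}

def bow(text):
    # crude bag-of-words tokenization; stands in for a real embedding model
    return Counter(text.lower().replace(",", " ").replace(".", " ").split())

def cosine(a, b):
    num = sum(a[t] * b[t] for t in set(a) & set(b))
    den = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

def retrieve(symptom, top_k=1):
    # rank every procedure by similarity to the observed symptom
    q = bow(symptom)
    ranked = sorted(procedures.items(), key=lambda kv: cosine(q, bow(kv[1])), reverse=True)
    return ranked[:top_k]

best_id, best_text = retrieve("pump pressure dropped on the cooling circuit")[0]
print(best_id)  # → P-101
```

The retrieved snippet would then be placed into the LLM prompt as grounding context, which is the step the paper's cross-validation caveat applies to.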
Related papers
- Steering LLMs via Scalable Interactive Oversight [74.12746881843044]
As Large Language Models increasingly automate complex, long-horizon tasks such as "vibe coding", a supervision gap has emerged. This presents a critical challenge in scalable oversight: enabling humans to responsibly steer AI systems on tasks that surpass their own ability to specify or verify.
arXiv Detail & Related papers (2026-02-04T04:52:00Z) - Enhancing Retrieval Augmentation via Adversarial Collaboration [50.117273835877334]
We propose the Adversarial Collaboration RAG (AC-RAG) framework to address "Retrieval Hallucinations". AC-RAG employs two heterogeneous agents: a generalist Detector that identifies knowledge gaps, and a domain-specialized Resolver that provides precise solutions. Experiments show that AC-RAG significantly improves retrieval accuracy and outperforms state-of-the-art RAG methods across various vertical domains.
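The Detector/Resolver interplay described above can be illustrated with a stub loop: the Detector flags an unresolved gap in a draft answer, the Resolver fills it, and the loop repeats until no gaps remain. Both agents here are simple hand-written rules with an invented fact table, standing in for the paper's LLM-based agents.

```python
# Toy sketch of AC-RAG's adversarial collaboration loop.
# KNOWN_FACTS, the placeholder syntax, and the draft answer are invented.
KNOWN_FACTS = {"valve torque spec": "45 Nm"}

def detector(answer):
    # generalist agent: return the first unresolved knowledge gap, if any
    return "valve torque spec" if "<?valve torque spec?>" in answer else None

def resolver(gap):
    # domain-specialized agent: supply the precise missing fact
    return KNOWN_FACTS[gap]

answer = "Tighten the inlet valve to <?valve torque spec?>."
while (gap := detector(answer)) is not None:
    answer = answer.replace(f"<?{gap}?>", resolver(gap))
print(answer)  # → Tighten the inlet valve to 45 Nm.
```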
arXiv Detail & Related papers (2025-09-18T08:54:20Z) - AI-Enhanced Operator Assistance for UNICOS Applications [41.99844472131922]
This project explores the development of an AI-enhanced operator assistant for UNICOS, CERN's UNified Industrial Control System. Preliminary evaluations suggest that the system is capable of decoding widgets, performing root cause analysis, and tracing DPEs across a complex.
arXiv Detail & Related papers (2025-09-16T03:43:54Z) - From PREVENTion to REACTion: Enhancing Failure Resolution in Naval Systems [4.171555557592296]
This paper reports our experience with a state-of-the-art failure prediction method, PREVENT, and its extension with a troubleshooting module, REACT, applied to naval systems developed by Fincantieri. We conclude by discussing the lessons learned, which may help deploy and extend these analyses to other industrial products.
arXiv Detail & Related papers (2025-08-21T13:57:14Z) - SOPBench: Evaluating Language Agents at Following Standard Operating Procedures and Constraints [59.645885492637845]
SOPBench is an evaluation pipeline that transforms each service-specific SOP code program into a directed graph of executable functions and requires agents to call these functions based on natural language SOP descriptions. We evaluate 18 leading models, and results show the task is challenging even for top-tier models.
arXiv Detail & Related papers (2025-03-11T17:53:02Z) - Interactive Agents to Overcome Ambiguity in Software Engineering [61.40183840499932]
AI agents are increasingly being deployed to automate tasks, often based on ambiguous and underspecified user instructions. Making unwarranted assumptions and failing to ask clarifying questions can lead to suboptimal outcomes. We study the ability of LLM agents to handle ambiguous instructions in interactive code generation settings by evaluating proprietary and open-weight models on their performance.
arXiv Detail & Related papers (2025-02-18T17:12:26Z) - SafeLLM: Domain-Specific Safety Monitoring for Large Language Models: A Case Study of Offshore Wind Maintenance [0.6116681488656472]
This paper introduces an innovative approach to tackle this challenge by capitalising on Large Language Models (LLMs).
We present a specialised conversational agent that incorporates statistical techniques to calculate distances between sentences for the detection and filtering of hallucinations and unsafe output.
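The distance-based filtering idea above can be sketched with a simple stand-in metric: accept an answer only if it lies close to at least one sentence in a trusted corpus, and flag it otherwise. Jaccard distance substitutes here for the paper's statistical techniques, and the trusted sentences and threshold are illustrative.

```python
# Hedged sketch of distance-based hallucination filtering, loosely after
# SafeLLM. The trusted corpus and the 0.8 threshold are invented examples.
def jaccard_distance(a, b):
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return 1 - len(sa & sb) / len(sa | sb) if sa | sb else 0.0

trusted = [
    "lock out the turbine before entering the nacelle",
    "inspect blade bolts for torque loss every six months",
]

def is_hallucination(answer, threshold=0.8):
    # flag the answer when it is far from every trusted sentence
    return min(jaccard_distance(answer, t) for t in trusted) > threshold

print(is_hallucination("inspect blade bolts for torque loss every six months"))  # → False
print(is_hallucination("pour seawater on the gearbox to cool it quickly"))  # → True
```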
arXiv Detail & Related papers (2024-10-06T13:00:53Z) - Exploring LLM-based Agents for Root Cause Analysis [17.053079105858497]
Root cause analysis (RCA) is a critical part of the incident management process.
Large Language Models (LLMs) have been used to perform RCA, but are not able to collect additional diagnostic information.
We present an evaluation of a ReAct agent equipped with retrieval tools, on an out-of-distribution dataset of production incidents collected at Microsoft.
arXiv Detail & Related papers (2024-03-07T00:44:01Z) - Analyzing Adversarial Inputs in Deep Reinforcement Learning [53.3760591018817]
We present a comprehensive analysis of the characterization of adversarial inputs, through the lens of formal verification.
We introduce a novel metric, the Adversarial Rate, to classify models based on their susceptibility to such perturbations.
Our analysis empirically demonstrates how adversarial inputs can affect the safety of a given DRL system with respect to such perturbations.
arXiv Detail & Related papers (2024-02-07T21:58:40Z) - Inverse Online Learning: Understanding Non-Stationary and Reactionary Policies [79.60322329952453]
We show how to develop interpretable representations of how agents make decisions.
By understanding the decision-making processes underlying a set of observed trajectories, we cast the policy inference problem as the inverse to this online learning problem.
We introduce a practical algorithm for retrospectively estimating such perceived effects, alongside the process through which agents update them.
Through application to the analysis of UNOS organ donation acceptance decisions, we demonstrate that our approach can bring valuable insights into the factors that govern decision processes and how they change over time.
arXiv Detail & Related papers (2022-03-14T17:40:42Z) - Inspect, Understand, Overcome: A Survey of Practical Methods for AI Safety [54.478842696269304]
The use of deep neural networks (DNNs) in safety-critical applications is challenging due to numerous model-inherent shortcomings.
In recent years, a zoo of state-of-the-art techniques aiming to address these safety concerns has emerged.
Our paper addresses both machine learning experts and safety engineers.
arXiv Detail & Related papers (2021-04-29T09:54:54Z) - Explainable AI for Robot Failures: Generating Explanations that Improve User Assistance in Fault Recovery [19.56670862587773]
We introduce a new type of explanation that conveys the cause of an unexpected failure during an agent's plan execution to non-experts.
We investigate how such explanations can be autonomously generated, extending an existing encoder-decoder model.
arXiv Detail & Related papers (2021-01-05T16:16:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.