Related papers: Towards Continuous Assurance with Formal Verification and Assurance Cases

Towards Continuous Assurance with Formal Verification and Assurance Cases

URL: http://arxiv.org/abs/2511.14805v1
Date: Mon, 17 Nov 2025 18:42:38 GMT
Title: Towards Continuous Assurance with Formal Verification and Assurance Cases
Authors: Dhaminda B. Abeywickrama, Michael Fisher, Frederic Wheeler, Louise Dennis,
Abstract summary: We propose a unified Continuous Assurance Framework that integrates design-time, runtime, and evolution-time assurance within a traceable, model-driven workflow.<n>We demonstrate our approach on a nuclear inspection robot scenario, and discuss its alignment with the Trilateral AI Principles.
Score: 0.5483130283061118
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Autonomous systems must sustain justified confidence in their correctness and safety across their operational lifecycle-from design and deployment through post-deployment evolution. Traditional assurance methods often separate development-time assurance from runtime assurance, yielding fragmented arguments that cannot adapt to runtime changes or system updates - a significant challenge for assured autonomy. Towards addressing this, we propose a unified Continuous Assurance Framework that integrates design-time, runtime, and evolution-time assurance within a traceable, model-driven workflow as a step towards assured autonomy. In this paper, we specifically instantiate the design-time phase of the framework using two formal verification methods: RoboChart for functional correctness and PRISM for probabilistic risk analysis. We also propose a model-driven transformation pipeline, implemented as an Eclipse plugin, that automatically regenerates structured assurance arguments whenever formal specifications or their verification results change, thereby ensuring traceability. We demonstrate our approach on a nuclear inspection robot scenario, and discuss its alignment with the Trilateral AI Principles, reflecting regulator-endorsed best practices.

Related papers

Agentic Problem Frames: A Systematic Approach to Engineering Reliable Domain Agents [0.0]
Large Language Models (LLMs) are evolving into autonomous agents, yet current "frameless" development--relying on ambiguous natural language--leads to critical risks such as scope creep and open-loop failures.<n>This study proposes Agentic Problem Frames (APF), a systematic engineering framework that shifts focus from internal model intelligence to the structured interaction between the agent and its environment.
arXiv Detail & Related papers (2026-02-22T06:32:32Z)
The Devil Behind Moltbook: Anthropic Safety is Always Vanishing in Self-Evolving AI Societies [57.387081435669835]
Multi-agent systems built from large language models offer a promising paradigm for scalable collective intelligence and self-evolution.<n>We show that an agent society satisfying continuous self-evolution, complete isolation, and safety invariance is impossible.<n>We propose several solution directions to alleviate the identified safety concern.
arXiv Detail & Related papers (2026-02-10T15:18:19Z)
Towards Verifiably Safe Tool Use for LLM Agents [53.55621104327779]
Large language model (LLM)-based AI agents extend capabilities by enabling access to tools such as data sources, APIs, search engines, code sandboxes, and even other agents.<n>LLMs may invoke unintended tool interactions and introduce risks, such as leaking sensitive data or overwriting critical records.<n>Current approaches to mitigate these risks, such as model-based safeguards, enhance agents' reliability but cannot guarantee system safety.
arXiv Detail & Related papers (2026-01-12T21:31:38Z)
Assured Autonomy: How Operations Research Powers and Orchestrates Generative AI Systems [18.881800772626427]
We argue generative models can be fragile in operational domains unless paired with mechanisms that provide feasibility, robustness to distribution shift, and stress testing.<n>We develop a conceptual framework for assured autonomy grounded in operations research.<n>These elements define a research agenda for assured autonomy in safety-critical, reliability-sensitive operational domains.
arXiv Detail & Related papers (2025-12-30T04:24:06Z)
RoboSafe: Safeguarding Embodied Agents via Executable Safety Logic [56.38397499463889]
Embodied agents powered by vision-language models (VLMs) are increasingly capable of executing complex real-world tasks.<n>However, they remain vulnerable to hazardous instructions that may trigger unsafe behaviors.<n>We propose RoboSafe, a runtime safeguard for embodied agents through executable predicate-based safety logic.
arXiv Detail & Related papers (2025-12-24T15:01:26Z)
A Scalable Framework for Safety Assurance of Self-Driving Vehicles based on Assurance 2.0 [3.2877342344625777]
Assurance 2.0 is a modern framework developed to address the assurance challenges of increasingly complex, adaptive, and autonomous systems.<n>It introduces reusable assurance theories and explicit counterarguments (defeaters) to enhance rigor, transparency, and adaptability.<n> limitations persist in confidence measurement, residual doubt management, automation support, and the practical handling of defeaters and confirmation bias.
arXiv Detail & Related papers (2025-09-30T16:13:03Z)
AgentGuard: Runtime Verification of AI Agents [1.14219428942199]
AgentGuard is a framework for runtime verification of Agentic AI systems.<n>It provides continuous, quantitative assurance through a new paradigm called Dynamic Probabilistic Assurance.
arXiv Detail & Related papers (2025-09-28T13:08:50Z)
CARE: Decoding Time Safety Alignment via Rollback and Introspection Intervention [68.95008546581339]
Existing decoding-time interventions, such as Contrastive Decoding, often force a severe trade-off between safety and response quality.<n>We propose CARE, a novel framework for decoding-time safety alignment that integrates three key components.<n>The framework achieves a superior balance of safety, quality, and efficiency, attaining a low harmful response rate and minimal disruption to the user experience.
arXiv Detail & Related papers (2025-09-01T04:50:02Z)
Trust, But Verify: A Self-Verification Approach to Reinforcement Learning with Verifiable Rewards [67.86091419220816]
Large Language Models (LLMs) show great promise in complex reasoning.<n>A prevalent issue is superficial self-reflection'', where models fail to robustly verify their own outputs.<n>We introduce RISE (Reinforcing Reasoning with Self-Verification), a novel online RL framework designed to tackle this.
arXiv Detail & Related papers (2025-05-19T17:59:31Z)
Engineering Risk-Aware, Security-by-Design Frameworks for Assurance of Large-Scale Autonomous AI Models [0.0]
This paper presents an enterprise-level, risk-aware, security-by-design approach for large-scale autonomous AI systems.<n>We detail a unified pipeline that delivers provable guarantees of model behavior under adversarial and operational stress.<n>Case studies in national security, open-source model governance, and industrial automation demonstrate measurable reductions in vulnerability and compliance overhead.
arXiv Detail & Related papers (2025-05-09T20:14:53Z)
Know Where You're Uncertain When Planning with Multimodal Foundation Models: A Formal Framework [54.40508478482667]
We present a comprehensive framework to disentangle, quantify, and mitigate uncertainty in perception and plan generation.<n>We propose methods tailored to the unique properties of perception and decision-making.<n>We show that our uncertainty disentanglement framework reduces variability by up to 40% and enhances task success rates by 5% compared to baselines.
arXiv Detail & Related papers (2024-11-03T17:32:00Z)
ACCESS: Assurance Case Centric Engineering of Safety-critical Systems [9.388301205192082]
Assurance cases are used to communicate and assess confidence in critical system properties such as safety and security. In recent years, model-based system assurance approaches have gained popularity to improve the efficiency and quality of system assurance activities. We show how model-based system assurance cases can trace to heterogeneous engineering artifacts.
arXiv Detail & Related papers (2024-03-22T14:29:50Z)
Recursively Feasible Probabilistic Safe Online Learning with Control Barrier Functions [60.26921219698514]
We introduce a model-uncertainty-aware reformulation of CBF-based safety-critical controllers. We then present the pointwise feasibility conditions of the resulting safety controller. We use these conditions to devise an event-triggered online data collection strategy.
arXiv Detail & Related papers (2022-08-23T05:02:09Z)

This list is automatically generated from the titles and abstracts of the papers in this site.