Towards Continuous Assurance with Formal Verification and Assurance Cases
- URL: http://arxiv.org/abs/2511.14805v1
- Date: Mon, 17 Nov 2025 18:42:38 GMT
- Title: Towards Continuous Assurance with Formal Verification and Assurance Cases
- Authors: Dhaminda B. Abeywickrama, Michael Fisher, Frederic Wheeler, Louise Dennis,
- Abstract summary: We propose a unified Continuous Assurance Framework that integrates design-time, runtime, and evolution-time assurance within a traceable, model-driven workflow.<n>We demonstrate our approach on a nuclear inspection robot scenario, and discuss its alignment with the Trilateral AI Principles.
- Score: 0.5483130283061118
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Autonomous systems must sustain justified confidence in their correctness and safety across their operational lifecycle-from design and deployment through post-deployment evolution. Traditional assurance methods often separate development-time assurance from runtime assurance, yielding fragmented arguments that cannot adapt to runtime changes or system updates - a significant challenge for assured autonomy. Towards addressing this, we propose a unified Continuous Assurance Framework that integrates design-time, runtime, and evolution-time assurance within a traceable, model-driven workflow as a step towards assured autonomy. In this paper, we specifically instantiate the design-time phase of the framework using two formal verification methods: RoboChart for functional correctness and PRISM for probabilistic risk analysis. We also propose a model-driven transformation pipeline, implemented as an Eclipse plugin, that automatically regenerates structured assurance arguments whenever formal specifications or their verification results change, thereby ensuring traceability. We demonstrate our approach on a nuclear inspection robot scenario, and discuss its alignment with the Trilateral AI Principles, reflecting regulator-endorsed best practices.
Related papers
- Agentic Problem Frames: A Systematic Approach to Engineering Reliable Domain Agents [0.0]
Large Language Models (LLMs) are evolving into autonomous agents, yet current "frameless" development--relying on ambiguous natural language--leads to critical risks such as scope creep and open-loop failures.<n>This study proposes Agentic Problem Frames (APF), a systematic engineering framework that shifts focus from internal model intelligence to the structured interaction between the agent and its environment.
arXiv Detail & Related papers (2026-02-22T06:32:32Z) - The Devil Behind Moltbook: Anthropic Safety is Always Vanishing in Self-Evolving AI Societies [57.387081435669835]
Multi-agent systems built from large language models offer a promising paradigm for scalable collective intelligence and self-evolution.<n>We show that an agent society satisfying continuous self-evolution, complete isolation, and safety invariance is impossible.<n>We propose several solution directions to alleviate the identified safety concern.
arXiv Detail & Related papers (2026-02-10T15:18:19Z) - Towards Verifiably Safe Tool Use for LLM Agents [53.55621104327779]
Large language model (LLM)-based AI agents extend capabilities by enabling access to tools such as data sources, APIs, search engines, code sandboxes, and even other agents.<n>LLMs may invoke unintended tool interactions and introduce risks, such as leaking sensitive data or overwriting critical records.<n>Current approaches to mitigate these risks, such as model-based safeguards, enhance agents' reliability but cannot guarantee system safety.
arXiv Detail & Related papers (2026-01-12T21:31:38Z) - Assured Autonomy: How Operations Research Powers and Orchestrates Generative AI Systems [18.881800772626427]
We argue generative models can be fragile in operational domains unless paired with mechanisms that provide feasibility, robustness to distribution shift, and stress testing.<n>We develop a conceptual framework for assured autonomy grounded in operations research.<n>These elements define a research agenda for assured autonomy in safety-critical, reliability-sensitive operational domains.
arXiv Detail & Related papers (2025-12-30T04:24:06Z) - RoboSafe: Safeguarding Embodied Agents via Executable Safety Logic [56.38397499463889]
Embodied agents powered by vision-language models (VLMs) are increasingly capable of executing complex real-world tasks.<n>However, they remain vulnerable to hazardous instructions that may trigger unsafe behaviors.<n>We propose RoboSafe, a runtime safeguard for embodied agents through executable predicate-based safety logic.
arXiv Detail & Related papers (2025-12-24T15:01:26Z) - A Scalable Framework for Safety Assurance of Self-Driving Vehicles based on Assurance 2.0 [3.2877342344625777]
Assurance 2.0 is a modern framework developed to address the assurance challenges of increasingly complex, adaptive, and autonomous systems.<n>It introduces reusable assurance theories and explicit counterarguments (defeaters) to enhance rigor, transparency, and adaptability.<n> limitations persist in confidence measurement, residual doubt management, automation support, and the practical handling of defeaters and confirmation bias.
arXiv Detail & Related papers (2025-09-30T16:13:03Z) - AgentGuard: Runtime Verification of AI Agents [1.14219428942199]
AgentGuard is a framework for runtime verification of Agentic AI systems.<n>It provides continuous, quantitative assurance through a new paradigm called Dynamic Probabilistic Assurance.
arXiv Detail & Related papers (2025-09-28T13:08:50Z) - CARE: Decoding Time Safety Alignment via Rollback and Introspection Intervention [68.95008546581339]
Existing decoding-time interventions, such as Contrastive Decoding, often force a severe trade-off between safety and response quality.<n>We propose CARE, a novel framework for decoding-time safety alignment that integrates three key components.<n>The framework achieves a superior balance of safety, quality, and efficiency, attaining a low harmful response rate and minimal disruption to the user experience.
arXiv Detail & Related papers (2025-09-01T04:50:02Z) - Trust, But Verify: A Self-Verification Approach to Reinforcement Learning with Verifiable Rewards [67.86091419220816]
Large Language Models (LLMs) show great promise in complex reasoning.<n>A prevalent issue is superficial self-reflection'', where models fail to robustly verify their own outputs.<n>We introduce RISE (Reinforcing Reasoning with Self-Verification), a novel online RL framework designed to tackle this.
arXiv Detail & Related papers (2025-05-19T17:59:31Z) - Engineering Risk-Aware, Security-by-Design Frameworks for Assurance of Large-Scale Autonomous AI Models [0.0]
This paper presents an enterprise-level, risk-aware, security-by-design approach for large-scale autonomous AI systems.<n>We detail a unified pipeline that delivers provable guarantees of model behavior under adversarial and operational stress.<n>Case studies in national security, open-source model governance, and industrial automation demonstrate measurable reductions in vulnerability and compliance overhead.
arXiv Detail & Related papers (2025-05-09T20:14:53Z) - Know Where You're Uncertain When Planning with Multimodal Foundation Models: A Formal Framework [54.40508478482667]
We present a comprehensive framework to disentangle, quantify, and mitigate uncertainty in perception and plan generation.<n>We propose methods tailored to the unique properties of perception and decision-making.<n>We show that our uncertainty disentanglement framework reduces variability by up to 40% and enhances task success rates by 5% compared to baselines.
arXiv Detail & Related papers (2024-11-03T17:32:00Z) - ACCESS: Assurance Case Centric Engineering of Safety-critical Systems [9.388301205192082]
Assurance cases are used to communicate and assess confidence in critical system properties such as safety and security.
In recent years, model-based system assurance approaches have gained popularity to improve the efficiency and quality of system assurance activities.
We show how model-based system assurance cases can trace to heterogeneous engineering artifacts.
arXiv Detail & Related papers (2024-03-22T14:29:50Z) - Recursively Feasible Probabilistic Safe Online Learning with Control Barrier Functions [60.26921219698514]
We introduce a model-uncertainty-aware reformulation of CBF-based safety-critical controllers.
We then present the pointwise feasibility conditions of the resulting safety controller.
We use these conditions to devise an event-triggered online data collection strategy.
arXiv Detail & Related papers (2022-08-23T05:02:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.