Safety Case Templates for Autonomous Systems
- URL: http://arxiv.org/abs/2102.02625v2
- Date: Thu, 11 Mar 2021 12:50:15 GMT
- Title: Safety Case Templates for Autonomous Systems
- Authors: Robin Bloomfield, Gareth Fletcher, Heidy Khlaaf, Luke Hinde, Philippa Ryan
- Abstract summary: This report documents safety assurance argument templates to support the deployment and operation of autonomous systems that include machine learning (ML) components.
The report also presents generic templates for argument defeaters and evidence confidence that can be used to strengthen, review, and adapt the templates as necessary.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This report documents safety assurance argument templates to support the
deployment and operation of autonomous systems that include machine learning
(ML) components. The document presents example safety argument templates
covering: the development of safety requirements, hazard analysis, a safety
monitor architecture for an autonomous system including at least one ML
element, a component with ML, and the adaptation and change of the system over
time. The report also presents generic templates for argument defeaters and
evidence confidence that can be used to strengthen, review, and adapt the
templates as necessary. This report is made available to get feedback on the
approach and on the templates. This work was sponsored by the UK Dstl under the
R-cloud framework.
Related papers
- Towards Verifiably Safe Tool Use for LLM Agents [53.55621104327779]
Large language model (LLM)-based AI agents extend capabilities by enabling access to tools such as data sources, APIs, search engines, code sandboxes, and even other agents.
LLMs may invoke unintended tool interactions and introduce risks, such as leaking sensitive data or overwriting critical records.
Current approaches to mitigate these risks, such as model-based safeguards, enhance agents' reliability but cannot guarantee system safety.
arXiv Detail & Related papers (2026-01-12T21:31:38Z)
- Monadic Context Engineering [59.95390010097654]
This paper introduces Monadic Context Engineering (MCE) to provide a formal foundation for agent design.
We demonstrate how Monads enable robust composition, how Applicatives provide a principled structure for parallel execution, and crucially, how Monad Transformers allow for the systematic composition of these capabilities.
This layered approach enables developers to construct complex, resilient, and efficient AI agents from simple, independently verifiable components.
arXiv Detail & Related papers (2025-12-27T01:52:06Z)
- Beyond Fixed and Dynamic Prompts: Embedded Jailbreak Templates for Advancing LLM Security [5.187020963919454]
This paper introduces the Embedded Jailbreak template, which preserves the structure of existing templates while naturally embedding harmful queries within their context.
We propose a progressive prompt-engineering methodology to ensure template quality and consistency, alongside standardized protocols for generation and evaluation.
arXiv Detail & Related papers (2025-11-18T04:59:10Z)
- AI Bill of Materials and Beyond: Systematizing Security Assurance through the AI Risk Scanning (AIRS) Framework [31.261980405052938]
Assurance for artificial intelligence (AI) systems remains fragmented across software supply-chain security, adversarial machine learning, and governance documentation.
This paper introduces the AI Risk Scanning (AIRS) Framework, a threat-model-based, evidence-generating framework designed to operationalize AI assurance.
arXiv Detail & Related papers (2025-11-16T16:10:38Z)
- Patching LLM Like Software: A Lightweight Method for Improving Safety Policy in Large Language Models [63.54707418559388]
We propose patching for large language models (LLMs) like software versions.
Our method enables rapid remediation by prepending a compact, learnable prefix to an existing model.
arXiv Detail & Related papers (2025-11-11T17:25:44Z)
- Automating Steering for Safe Multimodal Large Language Models [58.36932318051907]
We introduce a modular and adaptive inference-time intervention technology, AutoSteer, without requiring any fine-tuning of the underlying model.
AutoSteer incorporates three core components: (1) a novel Safety Awareness Score (SAS) that automatically identifies the most safety-relevant distinctions among the model's internal layers; (2) an adaptive safety prober trained to estimate the likelihood of toxic outputs from intermediate representations; and (3) a lightweight Refusal Head that selectively intervenes to modulate generation when safety risks are detected.
arXiv Detail & Related papers (2025-07-17T16:04:55Z)
- LLM Agents Should Employ Security Principles [60.03651084139836]
This paper argues that the well-established design principles in information security should be employed when deploying Large Language Model (LLM) agents at scale.
We introduce AgentSandbox, a conceptual framework embedding these security principles to provide safeguards throughout an agent's life-cycle.
arXiv Detail & Related papers (2025-05-29T21:39:08Z)
- T2VShield: Model-Agnostic Jailbreak Defense for Text-to-Video Models [88.63040835652902]
Text to video models are vulnerable to jailbreak attacks, where specially crafted prompts bypass safety mechanisms and lead to the generation of harmful or unsafe content.
We propose T2VShield, a comprehensive and model agnostic defense framework designed to protect text to video models from jailbreak threats.
Our method systematically analyzes the input, model, and output stages to identify the limitations of existing defenses.
arXiv Detail & Related papers (2025-04-22T01:18:42Z)
- SafeAuto: Knowledge-Enhanced Safe Autonomous Driving with Multimodal Foundation Models [63.71984266104757]
Multimodal Large Language Models (MLLMs) can process both visual and textual data.
We propose SafeAuto, a novel framework that enhances MLLM-based autonomous driving systems by incorporating both unstructured and structured knowledge.
arXiv Detail & Related papers (2025-02-28T21:53:47Z)
- Why Safeguarded Ships Run Aground? Aligned Large Language Models' Safety Mechanisms Tend to Be Anchored in The Template Region [13.962617572588393]
We show that template-anchored safety alignment is widespread across various aligned large language models (LLMs).
Our mechanistic analyses demonstrate how it leads to models' susceptibility when encountering inference-time jailbreak attacks.
We show that detaching safety mechanisms from the template region is promising in mitigating vulnerabilities to jailbreak attacks.
arXiv Detail & Related papers (2025-02-19T18:42:45Z)
- Safety case template for frontier AI: A cyber inability argument [2.2628353000034065]
We propose a safety case template for offensive cyber capabilities.
We identify a number of risk models, derive proxy tasks from the risk models, define evaluation settings for the proxy tasks, and connect those with evaluation results.
arXiv Detail & Related papers (2024-11-12T18:45:08Z)
- SafeBench: A Safety Evaluation Framework for Multimodal Large Language Models [75.67623347512368]
We propose SafeBench, a comprehensive framework designed for conducting safety evaluations of MLLMs.
Our framework consists of a comprehensive harmful query dataset and an automated evaluation protocol.
Based on our framework, we conducted large-scale experiments on 15 widely-used open-source MLLMs and 6 commercial MLLMs.
arXiv Detail & Related papers (2024-10-24T17:14:40Z)
- Automatic Instantiation of Assurance Cases from Patterns Using Large Language Models [6.314768437420443]
Large Language Models (LLMs) can generate assurance cases that comply with specific patterns.
LLMs exhibit potential in the automatic generation of assurance cases, but their capabilities still fall short compared to human experts.
arXiv Detail & Related papers (2024-10-07T20:58:29Z)
- AdaShield: Safeguarding Multimodal Large Language Models from Structure-based Attack via Adaptive Shield Prompting [54.931241667414184]
We propose Adaptive Shield Prompting, which prepends inputs with defense prompts to defend MLLMs against structure-based jailbreak attacks.
Our methods can consistently improve MLLMs' robustness against structure-based jailbreak attacks.
arXiv Detail & Related papers (2024-03-14T15:57:13Z)
- A SysML Profile for the Standardized Description of Processes during System Development [40.539768677361735]
The VDI/VDE 3682 standard for Formalised Process Description (FPD) provides a simple and easily understandable representation of processes.
This contribution focuses on the development of a Domain-Specific Modeling Language (DSML) that facilitates the integration of VDI/VDE 3682 into the Systems Modeling Language (SysML).
arXiv Detail & Related papers (2024-03-11T13:44:38Z)
- A General Framework for Verification and Control of Dynamical Models via Certificate Synthesis [54.959571890098786]
We provide a framework to encode system specifications and define corresponding certificates.
We present an automated approach to formally synthesise controllers and certificates.
Our approach contributes to the broad field of safe learning for control, exploiting the flexibility of neural networks.
arXiv Detail & Related papers (2023-09-12T09:37:26Z)
- Monitoring ROS2: from Requirements to Autonomous Robots [58.720142291102135]
This paper provides an overview of a formal approach to generating runtime monitors for autonomous robots from requirements written in a structured natural language.
Our approach integrates the Formal Requirement Elicitation Tool (FRET) with Copilot, a runtime verification framework, through the Ogma integration tool.
arXiv Detail & Related papers (2022-09-28T12:19:13Z)
- Reliability Assessment and Safety Arguments for Machine Learning Components in Assuring Learning-Enabled Autonomous Systems [19.65793237440738]
We present an overall assurance framework for Learning-Enabled Systems (LES).
We then introduce a novel model-agnostic Reliability Assessment Model (RAM) for ML classifiers.
We discuss the model assumptions and the inherent challenges of assessing ML reliability uncovered by our RAM.
arXiv Detail & Related papers (2021-11-30T14:39:22Z)
- The missing link: Developing a safety case for perception components in automated driving [10.43163823170716]
Perception is a key aspect of automated driving systems (AD) that relies heavily on Machine Learning (ML).
Despite the known challenges with the safety assurance of ML-based components, proposals have recently emerged for unit-level safety cases addressing these components.
We propose a generic template for such a linking argument specifically tailored for perception components.
arXiv Detail & Related papers (2021-08-30T15:12:27Z)
- SMT-Based Safety Verification of Data-Aware Processes under Ontologies (Extended Version) [71.12474112166767]
We introduce a variant of one of the most investigated models in this spectrum, namely simple artifact systems (SASs).
This DL, enjoying suitable model-theoretic properties, allows us to define SASs to which backward reachability can still be applied, leading to decidability in PSPACE of the corresponding safety problems.
arXiv Detail & Related papers (2021-08-27T15:04:11Z)
- SMT-based Safety Verification of Parameterised Multi-Agent Systems [78.04236259129524]
We study the verification of parameterised multi-agent systems (MASs).
In particular, we study whether unwanted states, characterised as a given state formula, are reachable in a given MAS.
arXiv Detail & Related papers (2020-08-11T15:24:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.