Related papers: From nuclear safety to LLM security: Applying non-probabilistic risk management strategies to build safe and secure LLM-powered systems

From nuclear safety to LLM security: Applying non-probabilistic risk management strategies to build safe and secure LLM-powered systems

URL: http://arxiv.org/abs/2505.17084v1
Date: Tue, 20 May 2025 16:07:41 GMT
Title: From nuclear safety to LLM security: Applying non-probabilistic risk management strategies to build safe and secure LLM-powered systems
Authors: Alexander Gutfraind, Vicki Bier,
Abstract summary: Large language models (LLMs) offer unprecedented and growing capabilities, but also introduce complex safety and security challenges.<n>Previous research found that risk management in various fields of engineering such as nuclear or civil engineering is often solved by generic (i.e. field-agnostic) strategies.<n>Here we show how emerging risks in LLM-powered systems could be met with 100+ of these non-probabilistic strategies to risk management.
Score: 49.1574468325115
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large language models (LLMs) offer unprecedented and growing capabilities, but also introduce complex safety and security challenges that resist conventional risk management. While conventional probabilistic risk analysis (PRA) requires exhaustive risk enumeration and quantification, the novelty and complexity of these systems make PRA impractical, particularly against adaptive adversaries. Previous research found that risk management in various fields of engineering such as nuclear or civil engineering is often solved by generic (i.e. field-agnostic) strategies such as event tree analysis or robust designs. Here we show how emerging risks in LLM-powered systems could be met with 100+ of these non-probabilistic strategies to risk management, including risks from adaptive adversaries. The strategies are divided into five categories and are mapped to LLM security (and AI safety more broadly). We also present an LLM-powered workflow for applying these strategies and other workflows suitable for solution architects. Overall, these strategies could contribute (despite some limitations) to security, safety and other dimensions of responsible AI.

Related papers

TELSAFE: Security Gap Quantitative Risk Assessment Framework [9.16098821053237]
Gaps between established security standards and their practical implementation have the potential to introduce vulnerabilities.<n>We introduce a new hybrid risk assessment framework called TELSAFE, which employs probabilistic modeling for quantitative risk assessment.
arXiv Detail & Related papers (2025-07-09T02:45:00Z)
An Approach to Technical AGI Safety and Security [72.83728459135101]
We develop an approach to address the risk of harms consequential enough to significantly harm humanity.<n>We focus on technical approaches to misuse and misalignment.<n>We briefly outline how these ingredients could be combined to produce safety cases for AGI systems.
arXiv Detail & Related papers (2025-04-02T15:59:31Z)
A Frontier AI Risk Management Framework: Bridging the Gap Between Current AI Practices and Established Risk Management [0.0]
The recent development of powerful AI systems has highlighted the need for robust risk management frameworks.<n>This paper presents a comprehensive risk management framework for the development of frontier AI.
arXiv Detail & Related papers (2025-02-10T16:47:00Z)
Large Language Model Safety: A Holistic Survey [35.42419096859496]
The rapid development and deployment of large language models (LLMs) have introduced a new frontier in artificial intelligence.<n>This survey provides a comprehensive overview of the current landscape of LLM safety, covering four major categories: value misalignment, robustness to adversarial attacks, misuse, and autonomous AI risks.
arXiv Detail & Related papers (2024-12-23T16:11:27Z)
EARBench: Towards Evaluating Physical Risk Awareness for Task Planning of Foundation Model-based Embodied AI Agents [53.717918131568936]
Embodied artificial intelligence (EAI) integrates advanced AI models into physical entities for real-world interaction.<n>Foundation models as the "brain" of EAI agents for high-level task planning have shown promising results.<n>However, the deployment of these agents in physical environments presents significant safety challenges.<n>This study introduces EARBench, a novel framework for automated physical risk assessment in EAI scenarios.
arXiv Detail & Related papers (2024-08-08T13:19:37Z)
Mapping LLM Security Landscapes: A Comprehensive Stakeholder Risk Assessment Proposal [0.0]
We propose a risk assessment process using tools like the risk rating methodology which is used for traditional systems. We conduct scenario analysis to identify potential threat agents and map the dependent system components against vulnerability factors. We also map threats against three key stakeholder groups.
arXiv Detail & Related papers (2024-03-20T05:17:22Z)
RiskQ: Risk-sensitive Multi-Agent Reinforcement Learning Value Factorization [49.26510528455664]
We introduce the Risk-sensitive Individual-Global-Max (RIGM) principle as a generalization of the Individual-Global-Max (IGM) and Distributional IGM (DIGM) principles. We show that RiskQ can obtain promising performance through extensive experiments.
arXiv Detail & Related papers (2023-11-03T07:18:36Z)
On strategies for risk management and decision making under uncertainty shared across multiple fields [55.2480439325792]
The paper finds more than 110 examples of such strategies and this approach to risk is termed RDOT: Risk-reducing Design and Operations Toolkit.<n>RDOT strategies fall into six broad categories: structural, reactive, formal, adversarial, multi-stage and positive.<n>Overall, RDOT represents an overlooked class of versatile responses to uncertainty.
arXiv Detail & Related papers (2023-09-06T16:14:32Z)
System Safety Engineering for Social and Ethical ML Risks: A Case Study [0.5249805590164902]
Governments, industry, and academia have undertaken efforts to identify and mitigate harms in ML-driven systems. Existing approaches are largely disjointed, ad-hoc and of unknown effectiveness. We focus in particular on how this analysis can extend to identifying social and ethical risks and developing concrete design-level controls to mitigate them.
arXiv Detail & Related papers (2022-11-08T22:58:58Z)
Efficient Risk-Averse Reinforcement Learning [79.61412643761034]
In risk-averse reinforcement learning (RL), the goal is to optimize some risk measure of the returns. We prove that under certain conditions this inevitably leads to a local-optimum barrier, and propose a soft risk mechanism to bypass it. We demonstrate improved risk aversion in maze navigation, autonomous driving, and resource allocation benchmarks.
arXiv Detail & Related papers (2022-05-10T19:40:52Z)

This list is automatically generated from the titles and abstracts of the papers in this site.