Related papers: Err

Related papers

The BIG Argument for AI Safety Cases [4.0675753909100445]
The BIG argument adopts a whole-system approach to constructing a safety case for AI systems of varying capability, autonomy and criticality. It is balanced by addressing safety alongside other critical ethical issues such as privacy and equity. It is integrated by bringing together the social, ethical and technical aspects of safety assurance in a way that is traceable and accountable.
arXiv Detail & Related papers (2025-03-12T11:33:28Z)
AI Safety for Everyone [3.440579243843689]
Recent discussions and research in AI safety have increasingly emphasized the deep connection between AI safety and existential risk from advanced AI systems. This framing may exclude researchers and practitioners who are committed to AI safety but approach the field from different angles. We find a vast array of concrete safety work that addresses immediate and practical concerns with current AI systems.
arXiv Detail & Related papers (2025-02-13T13:04:59Z)
Position: A taxonomy for reporting and describing AI security incidents [57.98317583163334]
We argue that specific are required to describe and report security incidents of AI systems. Existing frameworks for either non-AI security or generic AI safety incident reporting are insufficient to capture the specific properties of AI security.
arXiv Detail & Related papers (2024-12-19T13:50:26Z)
Safety case template for frontier AI: A cyber inability argument [2.2628353000034065]
We propose a safety case template for offensive cyber capabilities. We identify a number of risk models, derive proxy tasks from the risk models, define evaluation settings for the proxy tasks, and connect those with evaluation results.
arXiv Detail & Related papers (2024-11-12T18:45:08Z)
Safety cases for frontier AI [0.8987776881291144]
Safety cases are reports that make a structured argument, supported by evidence, that a system is safe enough in a given operational context. Safety cases are already common in other safety-critical industries such as aviation and nuclear power. We explain why they may also be a useful tool in frontier AI governance, both in industry self-regulation and government regulation.
arXiv Detail & Related papers (2024-10-28T22:08:28Z)
Multimodal Situational Safety [73.63981779844916]
We present the first evaluation and analysis of a novel safety challenge termed Multimodal Situational Safety. For an MLLM to respond safely, whether through language or action, it often needs to assess the safety implications of a language query within its corresponding visual context. We develop the Multimodal Situational Safety benchmark (MSSBench) to assess the situational safety performance of current MLLMs.
arXiv Detail & Related papers (2024-10-08T16:16:07Z)
Safetywashing: Do AI Safety Benchmarks Actually Measure Safety Progress? [59.96471873997733]
We propose an empirical foundation for developing more meaningful safety metrics and define AI safety in a machine learning research context. We aim to provide a more rigorous framework for AI safety research, advancing the science of safety evaluations and clarifying the path towards measurable progress.
arXiv Detail & Related papers (2024-07-31T17:59:24Z)
Towards Guaranteed Safe AI: A Framework for Ensuring Robust and Reliable AI Systems [88.80306881112313]
We will introduce and define a family of approaches to AI safety, which we will refer to as guaranteed safe (GS) AI. The core feature of these approaches is that they aim to produce AI systems which are equipped with high-assurance quantitative safety guarantees. We outline a number of approaches for creating each of these three core components, describe the main technical challenges, and suggest a number of potential solutions to them.
arXiv Detail & Related papers (2024-05-10T17:38:32Z)
The Open Autonomy Safety Case Framework [3.2995359570845917]
Safety cases have become a best practice for measuring, managing, and communicating the safety of autonomous vehicles. This paper introduces the Open Autonomy Safety Case Framework, developed over years of work with the autonomous vehicle industry.
arXiv Detail & Related papers (2024-04-08T12:26:06Z)
The Art of Defending: A Systematic Evaluation and Analysis of LLM Defense Strategies on Safety and Over-Defensiveness [56.174255970895466]
Large Language Models (LLMs) play an increasingly pivotal role in natural language processing applications. This paper presents Safety and Over-Defensiveness Evaluation (SODE) benchmark.
arXiv Detail & Related papers (2023-12-30T17:37:06Z)
Towards Safer Generative Language Models: A Survey on Safety Risks, Evaluations, and Improvements [76.80453043969209]
This survey presents a framework for safety research pertaining to large models. We begin by introducing safety issues of wide concern, then delve into safety evaluation methods for large models. We explore the strategies for enhancing large model safety from training to deployment.
arXiv Detail & Related papers (2023-02-18T09:32:55Z)

This list is automatically generated from the titles and abstracts of the papers in this site.