Usage Governance Advisor: From Intent to AI Governance
- URL: http://arxiv.org/abs/2412.01957v2
- Date: Thu, 23 Jan 2025 14:49:53 GMT
- Title: Usage Governance Advisor: From Intent to AI Governance
- Authors: Elizabeth M. Daly, Sean Rooney, Seshu Tirupathi, Luis Garces-Erice, Inge Vejsbjerg, Frank Bagehorn, Dhaval Salwala, Christopher Giblin, Mira L. Wolf-Bauwens, Ioana Giurgiu, Michael Hind, Peter Urbanetz
- Abstract summary: Evaluating the safety of AI systems is a pressing concern for organizations deploying them. We present Usage Governance Advisor, which creates semi-structured governance information.
- Score: 4.49852442764084
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Evaluating the safety of AI systems is a pressing concern for organizations deploying them. In addition to the societal damage done by the lack of fairness of those systems, deployers are concerned about the legal repercussions and the reputational damage incurred by using models that are unsafe. Safety covers both what a model does (e.g., can it be used to reveal personal information from its training set?) and how a model was built (e.g., was it trained only on licensed data sets?). Determining the safety of an AI system requires gathering information from a wide set of heterogeneous sources, including safety benchmarks and technical documentation for the set of models used in that system. In addition, responsible use is encouraged through mechanisms that advise and help the user to take mitigating actions where safety risks are detected. We present Usage Governance Advisor, which creates semi-structured governance information, identifies and prioritizes risks according to the intended use case, recommends appropriate benchmarks and risk assessments, and, importantly, proposes mitigation strategies and actions.
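The abstract describes a pipeline from an intended use case to prioritized risks, recommended benchmarks, and mitigation actions. As a rough illustration of what such semi-structured governance information could look like, the following Python sketch defines a hypothetical record type and a naive severity-times-likelihood prioritization; all class names, fields, and example values are assumptions for illustration, not the paper's actual schema or system.

```python
# Illustrative sketch only: a hypothetical data model for the kind of
# semi-structured governance information described in the abstract.
from dataclasses import dataclass, field


@dataclass
class Risk:
    name: str                       # e.g. "PII leakage from training data"
    category: str                   # e.g. "privacy", "fairness", "provenance"
    severity: int                   # 1 (low) .. 5 (critical), assessed per use case
    likelihood: int                 # 1 (rare) .. 5 (expected)
    benchmarks: list[str] = field(default_factory=list)   # suggested evaluations
    mitigations: list[str] = field(default_factory=list)  # suggested actions


@dataclass
class GovernanceRecord:
    use_case: str                   # the deployer's stated intent
    models: list[str]               # models used in the AI system
    risks: list[Risk] = field(default_factory=list)

    def prioritized_risks(self) -> list[Risk]:
        """Rank risks for this use case; here by a simple severity x likelihood score."""
        return sorted(self.risks, key=lambda r: r.severity * r.likelihood, reverse=True)


record = GovernanceRecord(
    use_case="customer-support chatbot for a retail bank",
    models=["example-llm-7b"],
    risks=[
        Risk("PII leakage from training data", "privacy", severity=5, likelihood=3,
             benchmarks=["membership-inference probe"],
             mitigations=["PII redaction filter", "output guardrail"]),
        Risk("Unlicensed training data", "provenance", severity=4, likelihood=2,
             benchmarks=["training-data documentation review"],
             mitigations=["request data provenance attestation"]),
    ],
)

for risk in record.prioritized_risks():
    print(risk.name, risk.benchmarks, risk.mitigations)
```

A real advisor would populate such records from heterogeneous sources (benchmark results, model documentation) rather than hand-written entries; the sketch only shows how intent, risks, benchmarks, and mitigations can be tied together in one structure.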
Related papers
- An Approach to Technical AGI Safety and Security [72.83728459135101]
We develop an approach to address the risk of harms consequential enough to significantly harm humanity.
We focus on technical approaches to misuse and misalignment.
We briefly outline how these ingredients could be combined to produce safety cases for AGI systems.
arXiv Detail & Related papers (2025-04-02T15:59:31Z) - A First-Principles Based Risk Assessment Framework and the IEEE P3396 Standard [0.0]
Generative Artificial Intelligence (AI) is enabling unprecedented automation in content creation and decision support.
This paper presents a first-principles risk assessment framework underlying the IEEE P3396 Recommended Practice for AI Risk, Safety, Trustworthiness, and Responsibility.
arXiv Detail & Related papers (2025-03-31T18:00:03Z) - AILuminate: Introducing v1.0 of the AI Risk and Reliability Benchmark from MLCommons [62.374792825813394]
This paper introduces AILuminate v1.0, the first comprehensive industry-standard benchmark for assessing AI-product risk and reliability.
The benchmark evaluates an AI system's resistance to prompts designed to elicit dangerous, illegal, or undesirable behavior in 12 hazard categories.
arXiv Detail & Related papers (2025-02-19T05:58:52Z) - Position: A taxonomy for reporting and describing AI security incidents [57.98317583163334]
We argue that a specific taxonomy is required to describe and report security incidents of AI systems.
Existing frameworks for either non-AI security or generic AI safety incident reporting are insufficient to capture the specific properties of AI security.
arXiv Detail & Related papers (2024-12-19T13:50:26Z) - Risk Sources and Risk Management Measures in Support of Standards for General-Purpose AI Systems [2.3266896180922187]
We compile an extensive catalog of risk sources and risk management measures for general-purpose AI systems.
This work involves identifying technical, operational, and societal risks across model development, training, and deployment stages.
The catalog is released under a public domain license for ease of direct use by stakeholders in AI governance and standards.
arXiv Detail & Related papers (2024-10-30T21:32:56Z) - "Glue pizza and eat rocks" -- Exploiting Vulnerabilities in Retrieval-Augmented Generative Models [74.05368440735468]
Retrieval-Augmented Generative (RAG) models enhance Large Language Models (LLMs) by integrating external knowledge bases.
In this paper, we demonstrate a security threat where adversaries can exploit the openness of these knowledge bases.
arXiv Detail & Related papers (2024-06-26T05:36:23Z) - Leveraging Traceability to Integrate Safety Analysis Artifacts into the Software Development Process [51.42800587382228]
Safety assurance cases (SACs) can be challenging to maintain during system evolution.
We propose a solution that leverages software traceability to connect relevant system artifacts to safety analysis models.
We elicit design rationales for system changes to help safety stakeholders analyze the impact of system changes on safety.
arXiv Detail & Related papers (2023-07-14T16:03:27Z) - Model evaluation for extreme risks [46.53170857607407]
Further progress in AI development could lead to capabilities that pose extreme risks, such as offensive cyber capabilities or strong manipulation skills.
We explain why model evaluation is critical for addressing extreme risks.
arXiv Detail & Related papers (2023-05-24T16:38:43Z) - Foveate, Attribute, and Rationalize: Towards Physically Safe and Trustworthy AI [76.28956947107372]
Covertly unsafe text is an area of particular interest, as such text may arise from everyday scenarios and is challenging to detect as harmful.
We propose FARM, a novel framework leveraging external knowledge for trustworthy rationale generation in the context of safety.
Our experiments show that FARM obtains state-of-the-art results on the SafeText dataset, improving safety classification accuracy by 5.9% absolute.
arXiv Detail & Related papers (2022-12-19T17:51:47Z) - Assurance Cases as Foundation Stone for Auditing AI-enabled and Autonomous Systems: Workshop Results and Political Recommendations for Action from the ExamAI Project [2.741266294612776]
We investigate the way safety standards define safety measures to be implemented against software faults.
Functional safety standards use Safety Integrity Levels (SILs) to define which safety measures shall be implemented.
We propose the use of assurance cases to argue that the individually selected and applied measures are sufficient.
arXiv Detail & Related papers (2022-08-17T10:05:07Z) - Risk Management Framework for Machine Learning Security [7.678455181587705]
Adversarial attacks on machine learning models have become a highly studied topic in both academia and industry.
In this paper, we outline a novel framework to guide the risk management process for organizations reliant on machine learning models.
arXiv Detail & Related papers (2020-12-09T06:21:34Z) - Evaluating the Safety of Deep Reinforcement Learning Models using Semi-Formal Verification [81.32981236437395]
We present a semi-formal verification approach for decision-making tasks based on interval analysis.
Our method obtains comparable results over standard benchmarks with respect to formal verifiers.
Our approach enables efficient evaluation of safety properties for decision-making models in practical applications (a minimal interval-propagation sketch follows this entry).
arXiv Detail & Related papers (2020-10-19T11:18:06Z)
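To make the interval-analysis idea in the last entry concrete, here is a hedged Python sketch of bound propagation through a tiny ReLU "policy" network: given a box of possible input states, it computes sound (if loose) bounds on the outputs and checks a safety threshold. The network, the input box, and the threshold are invented for illustration and are not the cited paper's actual method or benchmarks.

```python
# Minimal interval bound propagation sketch (illustrative, not the paper's verifier).
import numpy as np


def interval_affine(lo, hi, W, b):
    """Propagate an input box [lo, hi] through x -> W @ x + b."""
    W_pos, W_neg = np.maximum(W, 0.0), np.minimum(W, 0.0)
    return W_pos @ lo + W_neg @ hi + b, W_pos @ hi + W_neg @ lo + b


def output_bounds(lo, hi, layers):
    """Sound (but possibly loose) output bounds for a stack of affine+ReLU layers."""
    for i, (W, b) in enumerate(layers):
        lo, hi = interval_affine(lo, hi, W, b)
        if i < len(layers) - 1:          # ReLU on hidden layers only
            lo, hi = np.maximum(lo, 0.0), np.maximum(hi, 0.0)
    return lo, hi


# Hypothetical two-layer decision model with 4 state inputs and 2 action scores.
rng = np.random.default_rng(0)
layers = [(rng.normal(size=(8, 4)), rng.normal(size=8)),
          (rng.normal(size=(2, 8)), rng.normal(size=2))]

# Hypothetical safety property: action score 0 never exceeds 10 for states in the box.
state_lo, state_hi = -0.1 * np.ones(4), 0.1 * np.ones(4)
lo, hi = output_bounds(state_lo, state_hi, layers)
print("output bounds:", lo, hi)
print("property certified:", bool(hi[0] <= 10.0))
```

If the upper bound stays below the threshold, the property holds for every state in the box; if not, the result is inconclusive because interval bounds over-approximate the reachable outputs.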
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.