Related papers: Visibility into AI Agents

Related papers

AURA: An Agent Autonomy Risk Assessment Framework [0.0]
AURA (Agent aUtonomy Risk Assessment) is a unified framework designed to detect, quantify, and mitigate risks arising from agentic AI.<n>AURA provides an interactive process to score, evaluate and mitigate the risks of running one or multiple AI Agents, synchronously or asynchronously.<n>AURA supports a responsible and transparent adoption of agentic AI and provides robust risk detection and mitigation while balancing computational resources.
arXiv Detail & Related papers (2025-10-17T15:30:29Z)
Never Compromise to Vulnerabilities: A Comprehensive Survey on AI Governance [211.5823259429128]
We propose a comprehensive framework integrating technical and societal dimensions, structured around three interconnected pillars: Intrinsic Security, Derivative Security, and Social Ethics.<n>We identify three core challenges: (1) the generalization gap, where defenses fail against evolving threats; (2) inadequate evaluation protocols that overlook real-world risks; and (3) fragmented regulations leading to inconsistent oversight.<n>Our framework offers actionable guidance for researchers, engineers, and policymakers to develop AI systems that are not only robust and secure but also ethically aligned and publicly trustworthy.
arXiv Detail & Related papers (2025-08-12T09:42:56Z)
SafeMobile: Chain-level Jailbreak Detection and Automated Evaluation for Multimodal Mobile Agents [58.21223208538351]
This work explores the security issues surrounding mobile multimodal agents.<n>It attempts to construct a risk discrimination mechanism by incorporating behavioral sequence information.<n>It also designs an automated assisted assessment scheme based on a large language model.
arXiv Detail & Related papers (2025-07-01T15:10:00Z)
Oversight Structures for Agentic AI in Public-Sector Organizations [0.0]
We identify five governance dimensions essential for responsible agent deployment.<n>We find that agent oversight poses intensified versions of three existing governance challenges.<n>We propose approaches that both adapt institutional structures and design agent oversight compatible with public sector constraints.
arXiv Detail & Related papers (2025-06-05T09:57:15Z)
TRiSM for Agentic AI: A Review of Trust, Risk, and Security Management in LLM-based Agentic Multi-Agent Systems [2.462408812529728]
This review presents a structured analysis of textbfTrust, Risk, and Security Management (TRiSM) in the context of LLM-based Agentic Multi-Agent Systems (AMAS)<n>We begin by examining the conceptual foundations of Agentic AI and highlight its architectural distinctions from traditional AI agents.<n>We then adapt and extend the AI TRiSM framework for Agentic AI, structured around four key pillars: Explainability, ModelOps, Security, Privacy and Governance.
arXiv Detail & Related papers (2025-06-04T16:26:11Z)
Inherent and emergent liability issues in LLM-based agentic systems: a principal-agent perspective [0.0]
Agentic systems powered by large language models (LLMs) are becoming progressively more complex and capable. Their increasing agency and expanding deployment settings attract growing attention over effective governance policies, monitoring and control protocols. We analyze the potential liability issues stemming from delegated use of LLM agents and their extended systems from a principal-agent perspective.
arXiv Detail & Related papers (2025-04-04T08:10:02Z)
Agentic Business Process Management: The Past 30 Years And Practitioners' Future Perspectives [0.7270112855088837]
We conduct a series of interviews with BPM practitioners to explore their understanding, expectations, and concerns related to agent autonomy, adaptability, human collaboration, and governance in processes. The findings reflect both challenges with respect to data inconsistencies, manual interventions, identification of process bottlenecks, actionability of process improvements, as well as the opportunities of enhanced efficiency, predictive process insights and proactive decision-making support. These concerns underscore the need for a robust methodological framework for managing agents in organizations.
arXiv Detail & Related papers (2025-03-23T20:15:24Z)
Media and responsible AI governance: a game-theoretic and LLM analysis [61.132523071109354]
This paper investigates the interplay between AI developers, regulators, users, and the media in fostering trustworthy AI systems. Using evolutionary game theory and large language models (LLMs), we model the strategic interactions among these actors under different regulatory regimes.
arXiv Detail & Related papers (2025-03-12T21:39:38Z)
Multi-Agent Risks from Advanced AI [90.74347101431474]
Multi-agent systems of advanced AI pose novel and under-explored risks. We identify three key failure modes based on agents' incentives, as well as seven key risk factors. We highlight several important instances of each risk, as well as promising directions to help mitigate them.
arXiv Detail & Related papers (2025-02-19T23:03:21Z)
AI Agents Should be Regulated Based on the Extent of Their Autonomous Operations [8.043534206868326]
We argue that AI agents should be regulated by the extent to which they operate autonomously.<n>Existing regulations often focus on computational scale as a proxy for potential harm.<n>We discuss relevant regulations and recommendations from scientists regarding existential risks.
arXiv Detail & Related papers (2025-02-07T09:40:48Z)
Criticality and Safety Margins for Reinforcement Learning [53.10194953873209]
We seek to define a criticality framework with both a quantifiable ground truth and a clear significance to users. We introduce true criticality as the expected drop in reward when an agent deviates from its policy for n consecutive random actions. We also introduce the concept of proxy criticality, a low-overhead metric that has a statistically monotonic relationship to true criticality.
arXiv Detail & Related papers (2024-09-26T21:00:45Z)
Large Model Based Agents: State-of-the-Art, Cooperation Paradigms, Security and Privacy, and Future Trends [64.57762280003618]
It is foreseeable that in the near future, LM-driven general AI agents will serve as essential tools in production tasks. This paper investigates scenarios involving the autonomous collaboration of future LM agents.
arXiv Detail & Related papers (2024-09-22T14:09:49Z)
Safeguarding AI Agents: Developing and Analyzing Safety Architectures [0.0]
This paper addresses the need for safety measures in AI systems that collaborate with human teams. We propose and evaluate three frameworks to enhance safety protocols in AI agent systems. We conclude that these frameworks can significantly strengthen the safety and security of AI agent systems.
arXiv Detail & Related papers (2024-09-03T10:14:51Z)
AI Agents Under Threat: A Survey of Key Security Challenges and Future Pathways [10.16690494897609]
An Artificial Intelligence (AI) agent is a software entity that autonomously performs tasks or makes decisions based on pre-defined objectives and data inputs. This survey delves into the emerging security threats faced by AI agents, categorizing them into four critical knowledge gaps. By systematically reviewing these threats, this paper highlights both the progress made and the existing limitations in safeguarding AI agents.
arXiv Detail & Related papers (2024-06-04T01:22:31Z)
Mapping LLM Security Landscapes: A Comprehensive Stakeholder Risk Assessment Proposal [0.0]
We propose a risk assessment process using tools like the risk rating methodology which is used for traditional systems. We conduct scenario analysis to identify potential threat agents and map the dependent system components against vulnerability factors. We also map threats against three key stakeholder groups.
arXiv Detail & Related papers (2024-03-20T05:17:22Z)
Prioritizing Safeguarding Over Autonomy: Risks of LLM Agents for Science [65.77763092833348]
Intelligent agents powered by large language models (LLMs) have demonstrated substantial promise in autonomously conducting experiments and facilitating scientific discoveries across various disciplines. While their capabilities are promising, these agents also introduce novel vulnerabilities that demand careful consideration for safety. This paper conducts a thorough examination of vulnerabilities in LLM-based agents within scientific domains, shedding light on potential risks associated with their misuse and emphasizing the need for safety measures.
arXiv Detail & Related papers (2024-02-06T18:54:07Z)
AgentBoard: An Analytical Evaluation Board of Multi-turn LLM Agents [76.95062553043607]
evaluating large language models (LLMs) is essential for understanding their capabilities and facilitating their integration into practical applications. We introduce AgentBoard, a pioneering comprehensive benchmark and accompanied open-source evaluation framework tailored to analytical evaluation of LLM agents.
arXiv Detail & Related papers (2024-01-24T01:51:00Z)
PsySafe: A Comprehensive Framework for Psychological-based Attack, Defense, and Evaluation of Multi-agent System Safety [70.84902425123406]
Multi-agent systems, when enhanced with Large Language Models (LLMs), exhibit profound capabilities in collective intelligence. However, the potential misuse of this intelligence for malicious purposes presents significant risks. We propose a framework (PsySafe) grounded in agent psychology, focusing on identifying how dark personality traits in agents can lead to risky behaviors. Our experiments reveal several intriguing phenomena, such as the collective dangerous behaviors among agents, agents' self-reflection when engaging in dangerous behavior, and the correlation between agents' psychological assessments and dangerous behaviors.
arXiv Detail & Related papers (2024-01-22T12:11:55Z)
IxDRL: A Novel Explainable Deep Reinforcement Learning Toolkit based on Analyses of Interestingness [0.0]
We propose a new framework based on analyses of interestingness. Our tool provides various measures of RL agent competence stemming from interestingness analysis. We show that our framework can provide agent designers with insights about RL agent competence.
arXiv Detail & Related papers (2023-07-18T02:43:19Z)
Global and Local Analysis of Interestingness for Competency-Aware Deep Reinforcement Learning [0.0]
We extend a recently-proposed framework for explainable reinforcement learning (RL) based on analyses of "interestingness" Our tools provide insights about RL agent competence, both their capabilities and limitations, enabling users to make more informed decisions.
arXiv Detail & Related papers (2022-11-11T17:48:42Z)
Inverse Online Learning: Understanding Non-Stationary and Reactionary Policies [79.60322329952453]
We show how to develop interpretable representations of how agents make decisions. By understanding the decision-making processes underlying a set of observed trajectories, we cast the policy inference problem as the inverse to this online learning problem. We introduce a practical algorithm for retrospectively estimating such perceived effects, alongside the process through which agents update them. Through application to the analysis of UNOS organ donation acceptance decisions, we demonstrate that our approach can bring valuable insights into the factors that govern decision processes and how they change over time.
arXiv Detail & Related papers (2022-03-14T17:40:42Z)
Toward Trustworthy AI Development: Mechanisms for Supporting Verifiable Claims [59.64274607533249]
AI developers need to make verifiable claims to which they can be held accountable. This report suggests various steps that different stakeholders can take to improve the verifiability of claims made about AI systems. We analyze ten mechanisms for this purpose--spanning institutions, software, and hardware--and make recommendations aimed at implementing, exploring, or improving those mechanisms.
arXiv Detail & Related papers (2020-04-15T17:15:35Z)

This list is automatically generated from the titles and abstracts of the papers in this site.