Taming Various Privilege Escalation in LLM-Based Agent Systems: A Mandatory Access Control Framework
- URL: http://arxiv.org/abs/2601.11893v1
- Date: Sat, 17 Jan 2026 03:22:56 GMT
- Title: Taming Various Privilege Escalation in LLM-Based Agent Systems: A Mandatory Access Control Framework
- Authors: Zimo Ji, Daoyuan Wu, Wenyuan Jiang, Pingchuan Ma, Zongjie Li, Yudong Gao, Shuai Wang, Yingjiu Li,
- Abstract summary: Large Language Model (LLM)-based agent systems are increasingly deployed for complex real-world tasks.<n>This paper aims to understand and mitigate such attacks through the lens of privilege escalation.<n>We propose SEAgent, a mandatory access control framework built upon attribute-based access control (ABAC)<n>Our evaluations show that SEAgent effectively blocks various privilege escalation while maintaining a low false positive rate and negligible system overhead.
- Score: 16.14469140816631
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large Language Model (LLM)-based agent systems are increasingly deployed for complex real-world tasks but remain vulnerable to natural language-based attacks that exploit over-privileged tool use. This paper aims to understand and mitigate such attacks through the lens of privilege escalation, defined as agent actions exceeding the least privilege required for a user's intended task. Based on a formal model of LLM agent systems, we identify novel privilege escalation scenarios, particularly in multi-agent systems, including a variant akin to the classic confused deputy problem. To defend against both known and newly demonstrated privilege escalation, we propose SEAgent, a mandatory access control (MAC) framework built upon attribute-based access control (ABAC). SEAgent monitors agent-tool interactions via an information flow graph and enforces customizable security policies based on entity attributes. Our evaluations show that SEAgent effectively blocks various privilege escalation while maintaining a low false positive rate and negligible system overhead. This demonstrates its robustness and adaptability in securing LLM-based agent systems.
Related papers
- Towards Verifiably Safe Tool Use for LLM Agents [53.55621104327779]
Large language model (LLM)-based AI agents extend capabilities by enabling access to tools such as data sources, APIs, search engines, code sandboxes, and even other agents.<n>LLMs may invoke unintended tool interactions and introduce risks, such as leaking sensitive data or overwriting critical records.<n>Current approaches to mitigate these risks, such as model-based safeguards, enhance agents' reliability but cannot guarantee system safety.
arXiv Detail & Related papers (2026-01-12T21:31:38Z) - PSG-Agent: Personality-Aware Safety Guardrail for LLM-based Agents [60.23552141928126]
PSG-Agent is a personalized and dynamic system for LLM-based agents.<n>First, PSG-Agent creates personalized guardrails by mining the interaction history for stable traits.<n>Second, PSG-Agent implements continuous monitoring across the agent pipeline with specialized guards.
arXiv Detail & Related papers (2025-09-28T03:31:59Z) - Secure and Efficient Access Control for Computer-Use Agents via Context Space [11.077973600902853]
CSAgent is a system-level, static policy-based access control framework for computer-use agents.<n>We implement and evaluate CSAgent, which successfully defends against more than 99.36% of attacks while introducing only 6.83% performance overhead.
arXiv Detail & Related papers (2025-09-26T12:19:27Z) - AgentVigil: Generic Black-Box Red-teaming for Indirect Prompt Injection against LLM Agents [54.29555239363013]
We propose a generic black-box fuzzing framework, AgentVigil, to automatically discover and exploit indirect prompt injection vulnerabilities.<n>We evaluate AgentVigil on two public benchmarks, AgentDojo and VWA-adv, where it achieves 71% and 70% success rates against agents based on o3-mini and GPT-4o.<n>We apply our attacks in real-world environments, successfully misleading agents to navigate to arbitrary URLs, including malicious sites.
arXiv Detail & Related papers (2025-05-09T07:40:17Z) - SAGA: A Security Architecture for Governing AI Agentic Systems [13.758038956671834]
Large Language Model (LLM)-based agents increasingly interact, collaborate, and delegate tasks to one another autonomously with minimal human interaction.<n>Industry guidelines for agentic system governance emphasize the need for users to maintain comprehensive control over their agents.<n>We propose SAGA, a scalable Security Architecture for Governing Agentic systems, that offers user oversight over their agents' lifecycle.
arXiv Detail & Related papers (2025-04-27T23:10:00Z) - Progent: Programmable Privilege Control for LLM Agents [46.31581986508561]
We introduce Progent, the first privilege control framework to secure Large Language Models agents.<n>Progent enforces security at the tool level by restricting agents to performing tool calls necessary for user tasks while blocking potentially malicious ones.<n>Thanks to our modular design, integrating Progent does not alter agent internals and only requires minimal changes to the existing agent implementation.
arXiv Detail & Related papers (2025-04-16T01:58:40Z) - Prompt Flow Integrity to Prevent Privilege Escalation in LLM Agents [12.072737324367937]
We propose Prompt Flow Integrity (PFI) to prevent privilege escalation in Large Language Models (LLMs)<n>PFI features three mitigation techniques -- i.e., agent isolation, secure untrusted data processing, and privilege escalation guardrails.<n>Our evaluation result shows that PFI effectively mitigates privilege escalation attacks while successfully preserving the utility of LLM agents.
arXiv Detail & Related papers (2025-03-17T05:27:57Z) - Automating Prompt Leakage Attacks on Large Language Models Using Agentic Approach [9.483655213280738]
This paper presents a novel approach to evaluating the security of large language models (LLMs)<n>We define prompt leakage as a critical threat to secure LLM deployment.<n>We implement a multi-agent system where cooperative agents are tasked with probing and exploiting the target LLM to elicit its prompt.
arXiv Detail & Related papers (2025-02-18T08:17:32Z) - Agent-as-a-Judge: Evaluate Agents with Agents [61.33974108405561]
We introduce the Agent-as-a-Judge framework, wherein agentic systems are used to evaluate agentic systems.
This is an organic extension of the LLM-as-a-Judge framework, incorporating agentic features that enable intermediate feedback for the entire task-solving process.
We present DevAI, a new benchmark of 55 realistic automated AI development tasks.
arXiv Detail & Related papers (2024-10-14T17:57:02Z) - AgentHarm: A Benchmark for Measuring Harmfulness of LLM Agents [84.96249955105777]
LLM agents may pose a greater risk if misused, but their robustness remains underexplored.<n>We propose a new benchmark called AgentHarm to facilitate research on LLM agent misuse.<n>We find leading LLMs are surprisingly compliant with malicious agent requests without jailbreaking.
arXiv Detail & Related papers (2024-10-11T17:39:22Z) - GuardAgent: Safeguard LLM Agents by a Guard Agent via Knowledge-Enabled Reasoning [79.07152553060601]
We propose GuardAgent, the first guardrail agent to protect target agents by dynamically checking whether their actions satisfy given safety guard requests.<n>Specifically, GuardAgent first analyzes the safety guard requests to generate a task plan, and then maps this plan into guardrail code for execution.<n>We show that GuardAgent effectively moderates the violation actions for different types of agents on two benchmarks with over 98% and 83% guardrail accuracies, respectively.
arXiv Detail & Related papers (2024-06-13T14:49:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.