Encrypted Prompt: Securing LLM Applications Against Unauthorized Actions
- URL: http://arxiv.org/abs/2503.23250v1
- Date: Sat, 29 Mar 2025 23:26:57 GMT
- Title: Encrypted Prompt: Securing LLM Applications Against Unauthorized Actions
- Authors: Shih-Han Chan,
- Abstract summary: Security threats like prompt injection attacks pose significant risks to applications that integrate Large Language Models.<n>This paper introduces a novel method that appends an Encrypted Prompt to each user prompt, embedding current permissions.
- Score: 2.1756081703276
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Security threats like prompt injection attacks pose significant risks to applications that integrate Large Language Models (LLMs), potentially leading to unauthorized actions such as API misuse. Unlike previous approaches that aim to detect these attacks on a best-effort basis, this paper introduces a novel method that appends an Encrypted Prompt to each user prompt, embedding current permissions. These permissions are verified before executing any actions (such as API calls) generated by the LLM. If the permissions are insufficient, the LLM's actions will not be executed, ensuring safety. This approach guarantees that only actions within the scope of the current permissions from the LLM can proceed. In scenarios where adversarial prompts are introduced to mislead the LLM, this method ensures that any unauthorized actions from LLM wouldn't be executed by verifying permissions in Encrypted Prompt. Thus, threats like prompt injection attacks that trigger LLM to generate harmful actions can be effectively mitigated.
Related papers
- SecurityLingua: Efficient Defense of LLM Jailbreak Attacks via Security-Aware Prompt Compression [11.839827036296649]
Large language models (LLMs) are vulnerable to malicious attacks even after safety alignment.<n>We propose SecurityLingua, an effective and efficient approach to defend LLMs against jailbreak attacks.<n>Thanks to prompt compression, SecurityLingua incurs only a negligible overhead and extra token cost compared to all existing defense methods.
arXiv Detail & Related papers (2025-06-15T03:39:13Z) - Defeating Prompt Injections by Design [79.00910871948787]
CaMeL is a robust defense that creates a protective system layer around the Large Language Models (LLMs)
To operate, CaMeL explicitly extracts the control and data flows from the (trusted) query.
We demonstrate effectiveness of CaMeL by solving $67%$ of tasks with provable security in AgentDojo [NeurIPS 2024], a recent agentic security benchmark.
arXiv Detail & Related papers (2025-03-24T15:54:10Z) - Prompt Flow Integrity to Prevent Privilege Escalation in LLM Agents [12.072737324367937]
Large Language Models (LLMs) are combined with plugins to create powerful LLM agents.<n>LLMs's behavior is determined at runtime by natural language prompts from either user or plugin's data.<n>We propose Prompt Flow Integrity (PFI) to prevent privilege escalation in LLM agents.
arXiv Detail & Related papers (2025-03-17T05:27:57Z) - Commercial LLM Agents Are Already Vulnerable to Simple Yet Dangerous Attacks [88.84977282952602]
A high volume of recent ML security literature focuses on attacks against aligned large language models (LLMs)<n>In this paper, we analyze security and privacy vulnerabilities that are unique to LLM agents.<n>We conduct a series of illustrative attacks on popular open-source and commercial agents, demonstrating the immediate practical implications of their vulnerabilities.
arXiv Detail & Related papers (2025-02-12T17:19:36Z) - Denial-of-Service Poisoning Attacks against Large Language Models [64.77355353440691]
LLMs are vulnerable to denial-of-service (DoS) attacks, where spelling errors or non-semantic prompts trigger endless outputs without generating an [EOS] token.
We propose poisoning-based DoS attacks for LLMs, demonstrating that injecting a single poisoned sample designed for DoS purposes can break the output length limit.
arXiv Detail & Related papers (2024-10-14T17:39:31Z) - Human-Interpretable Adversarial Prompt Attack on Large Language Models with Situational Context [49.13497493053742]
This research explores converting a nonsensical suffix attack into a sensible prompt via a situation-driven contextual re-writing.
We combine an independent, meaningful adversarial insertion and situations derived from movies to check if this can trick an LLM.
Our approach demonstrates that a successful situation-driven attack can be executed on both open-source and proprietary LLMs.
arXiv Detail & Related papers (2024-07-19T19:47:26Z) - Knowledge Return Oriented Prompting (KROP) [0.0]
KROP is a prompt injection technique capable of obfuscating prompt injection attacks.
This paper introduces KROP, a prompt injection technique capable of obfuscating prompt injection attacks.
arXiv Detail & Related papers (2024-06-11T23:58:37Z) - Uncovering Safety Risks of Large Language Models through Concept Activation Vector [13.804245297233454]
We introduce a Safety Concept Activation Vector (SCAV) framework to guide attacks on large language models (LLMs)<n>We then develop an SCAV-guided attack method that can generate both attack prompts and embedding-level attacks.<n>Our attack method significantly improves the attack success rate and response quality while requiring less training data.
arXiv Detail & Related papers (2024-04-18T09:46:25Z) - Benchmarking and Defending Against Indirect Prompt Injection Attacks on Large Language Models [79.0183835295533]
We introduce the first benchmark for indirect prompt injection attacks, named BIPIA, to assess the risk of such vulnerabilities.<n>Our analysis identifies two key factors contributing to their success: LLMs' inability to distinguish between informational context and actionable instructions, and their lack of awareness in avoiding the execution of instructions within external content.<n>We propose two novel defense mechanisms-boundary awareness and explicit reminder-to address these vulnerabilities in both black-box and white-box settings.
arXiv Detail & Related papers (2023-12-21T01:08:39Z) - Identifying and Mitigating Vulnerabilities in LLM-Integrated
Applications [37.316238236750415]
Large language models (LLMs) are increasingly deployed as the service backend for LLM-integrated applications.
In this work, we consider a setup where the user and LLM interact via an LLM-integrated application in the middle.
We identify potential vulnerabilities that can originate from the malicious application developer or from an outsider threat.
We develop a lightweight, threat-agnostic defense that mitigates both insider and outsider threats.
arXiv Detail & Related papers (2023-11-07T20:13:05Z) - Evaluating the Instruction-Following Robustness of Large Language Models
to Prompt Injection [70.28425745910711]
Large Language Models (LLMs) have demonstrated exceptional proficiency in instruction-following.
This capability brings with it the risk of prompt injection attacks.
We evaluate the robustness of instruction-following LLMs against such attacks.
arXiv Detail & Related papers (2023-08-17T06:21:50Z) - Not what you've signed up for: Compromising Real-World LLM-Integrated
Applications with Indirect Prompt Injection [64.67495502772866]
Large Language Models (LLMs) are increasingly being integrated into various applications.
We show how attackers can override original instructions and employed controls using Prompt Injection attacks.
We derive a comprehensive taxonomy from a computer security perspective to systematically investigate impacts and vulnerabilities.
arXiv Detail & Related papers (2023-02-23T17:14:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.