ConfusedPilot: Confused Deputy Risks in RAG-based LLMs
- URL: http://arxiv.org/abs/2408.04870v5
- Date: Wed, 23 Oct 2024 05:55:31 GMT
- Title: ConfusedPilot: Confused Deputy Risks in RAG-based LLMs
- Authors: Ayush RoyChowdhury, Mulong Luo, Prateek Sahu, Sarbartha Banerjee, Mohit Tiwari,
- Abstract summary: We introduce ConfusedPilot, a class of security vulnerabilities of RAG systems that confuse Copilot and cause integrity and confidentiality violations in its responses.
This study highlights the security vulnerabilities in today's RAG-based systems and proposes design guidelines to secure future RAG-based systems.
- Score: 2.423202571519879
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Retrieval augmented generation (RAG) is a process where a large language model (LLM) retrieves useful information from a database and then generates the responses. It is becoming popular in enterprise settings for daily business operations. For example, Copilot for Microsoft 365 has accumulated millions of businesses. However, the security implications of adopting such RAG-based systems are unclear. In this paper, we introduce ConfusedPilot, a class of security vulnerabilities of RAG systems that confuse Copilot and cause integrity and confidentiality violations in its responses. First, we investigate a vulnerability that embeds malicious text in the modified prompt in RAG, corrupting the responses generated by the LLM. Second, we demonstrate a vulnerability that leaks secret data, which leverages the caching mechanism during retrieval. Third, we investigate how both vulnerabilities can be exploited to propagate misinformation within the enterprise and ultimately impact its operations, such as sales and manufacturing. We also discuss the root cause of these attacks by investigating the architecture of a RAG-based system. This study highlights the security vulnerabilities in today's RAG-based systems and proposes design guidelines to secure future RAG-based systems.
Related papers
- Towards Trustworthy Retrieval Augmented Generation for Large Language Models: A Survey [92.36487127683053]
Retrieval-Augmented Generation (RAG) is an advanced technique designed to address the challenges of Artificial Intelligence-Generated Content (AIGC)
RAG provides reliable and up-to-date external knowledge, reduces hallucinations, and ensures relevant context across a wide range of tasks.
Despite RAG's success and potential, recent studies have shown that the RAG paradigm also introduces new risks, including privacy concerns, adversarial attacks, and accountability issues.
arXiv Detail & Related papers (2025-02-08T06:50:47Z) - SafeRAG: Benchmarking Security in Retrieval-Augmented Generation of Large Language Model [17.046058202577985]
We introduce a benchmark named SafeRAG designed to evaluate the RAG security.
First, we classify attack tasks into silver noise, inter-context conflict, soft ad, and white Denial-of-Service.
We then utilize the SafeRAG dataset to simulate various attack scenarios that RAG may encounter.
arXiv Detail & Related papers (2025-01-28T17:01:31Z) - Pirates of the RAG: Adaptively Attacking LLMs to Leak Knowledge Bases [11.101624331624933]
This paper presents a black-box attack to force a RAG system to leak its private knowledge base.
A relevance-based mechanism and an attacker-side open-source LLM favor the generation of effective queries to leak most of the (hidden) knowledge base.
arXiv Detail & Related papers (2024-12-24T09:03:57Z) - HijackRAG: Hijacking Attacks against Retrieval-Augmented Large Language Models [18.301965456681764]
We reveal a novel vulnerability, the retrieval prompt hijack attack (HijackRAG)
HijackRAG enables attackers to manipulate the retrieval mechanisms of RAG systems by injecting malicious texts into the knowledge database.
We propose both black-box and white-box attack strategies tailored to different levels of the attacker's knowledge.
arXiv Detail & Related papers (2024-10-30T09:15:51Z) - Trustworthiness in Retrieval-Augmented Generation Systems: A Survey [59.26328612791924]
Retrieval-Augmented Generation (RAG) has quickly grown into a pivotal paradigm in the development of Large Language Models (LLMs)
We propose a unified framework that assesses the trustworthiness of RAG systems across six key dimensions: factuality, robustness, fairness, transparency, accountability, and privacy.
arXiv Detail & Related papers (2024-09-16T09:06:44Z) - "Glue pizza and eat rocks" -- Exploiting Vulnerabilities in Retrieval-Augmented Generative Models [74.05368440735468]
Retrieval-Augmented Generative (RAG) models enhance Large Language Models (LLMs)
In this paper, we demonstrate a security threat where adversaries can exploit the openness of these knowledge bases.
arXiv Detail & Related papers (2024-06-26T05:36:23Z) - Is My Data in Your Retrieval Database? Membership Inference Attacks Against Retrieval Augmented Generation [0.9217021281095907]
We introduce an efficient and easy-to-use method for conducting a Membership Inference Attack (MIA) against RAG systems.
We demonstrate the effectiveness of our attack using two benchmark datasets and multiple generative models.
Our findings highlight the importance of implementing security countermeasures in deployed RAG systems.
arXiv Detail & Related papers (2024-05-30T19:46:36Z) - The Good and The Bad: Exploring Privacy Issues in Retrieval-Augmented
Generation (RAG) [56.67603627046346]
Retrieval-augmented generation (RAG) is a powerful technique to facilitate language model with proprietary and private data.
In this work, we conduct empirical studies with novel attack methods, which demonstrate the vulnerability of RAG systems on leaking the private retrieval database.
arXiv Detail & Related papers (2024-02-23T18:35:15Z) - Seven Failure Points When Engineering a Retrieval Augmented Generation
System [1.8776685617612472]
RAG systems aim to reduce the problem of hallucinated responses from large language models.
RAG systems suffer from limitations inherent to information retrieval systems.
We present an experience report on the failure points of RAG systems from three case studies.
arXiv Detail & Related papers (2024-01-11T12:04:11Z) - On the Security Risks of Knowledge Graph Reasoning [71.64027889145261]
We systematize the security threats to KGR according to the adversary's objectives, knowledge, and attack vectors.
We present ROAR, a new class of attacks that instantiate a variety of such threats.
We explore potential countermeasures against ROAR, including filtering of potentially poisoning knowledge and training with adversarially augmented queries.
arXiv Detail & Related papers (2023-05-03T18:47:42Z) - Invisible for both Camera and LiDAR: Security of Multi-Sensor Fusion
based Perception in Autonomous Driving Under Physical-World Attacks [62.923992740383966]
We present the first study of security issues of MSF-based perception in AD systems.
We generate a physically-realizable, adversarial 3D-printed object that misleads an AD system to fail in detecting it and thus crash into it.
Our results show that the attack achieves over 90% success rate across different object types and MSF.
arXiv Detail & Related papers (2021-06-17T05:11:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.