Related papers: Is My Data in Your Retrieval Database? Membership Inference Attacks Against Retrieval Augmented Generation

Is My Data in Your Retrieval Database? Membership Inference Attacks Against Retrieval Augmented Generation

URL: http://arxiv.org/abs/2405.20446v2
Date: Fri, 7 Jun 2024 09:39:39 GMT
Title: Is My Data in Your Retrieval Database? Membership Inference Attacks Against Retrieval Augmented Generation
Authors: Maya Anderson, Guy Amit, Abigail Goldsteen,
Abstract summary: We introduce an efficient and easy-to-use method for conducting a Membership Inference Attack (MIA) against RAG systems. We demonstrate the effectiveness of our attack using two benchmark datasets and multiple generative models. Our findings highlight the importance of implementing security countermeasures in deployed RAG systems.
Score: 0.9217021281095907
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Retrieval Augmented Generation (RAG) systems have shown great promise in natural language processing. However, their reliance on data stored in a retrieval database, which may contain proprietary or sensitive information, introduces new privacy concerns. Specifically, an attacker may be able to infer whether a certain text passage appears in the retrieval database by observing the outputs of the RAG system, an attack known as a Membership Inference Attack (MIA). Despite the significance of this threat, MIAs against RAG systems have yet remained under-explored. This study addresses this gap by introducing an efficient and easy-to-use method for conducting MIA against RAG systems. We demonstrate the effectiveness of our attack using two benchmark datasets and multiple generative models, showing that the membership of a document in the retrieval database can be efficiently determined through the creation of an appropriate prompt in both black-box and gray-box settings. Moreover, we introduce an initial defense strategy based on adding instructions to the RAG template, which shows high effectiveness for some datasets and models. Our findings highlight the importance of implementing security countermeasures in deployed RAG systems and developing more advanced defenses to protect the privacy and security of retrieval databases.

Related papers

DATABench: Evaluating Dataset Auditing in Deep Learning from an Adversarial Perspective [59.66984417026933]
We introduce a novel taxonomy, classifying existing methods based on their reliance on internal features (IF) (inherent to the data) versus external features (EF) (artificially introduced for auditing)<n>We formulate two primary attack types: evasion attacks, designed to conceal the use of a dataset, and forgery attacks, intending to falsely implicate an unused dataset.<n>Building on the understanding of existing methods and attack objectives, we further propose systematic attack strategies: decoupling, removal, and detection for evasion; adversarial example-based methods for forgery.<n>Our benchmark, DATABench, comprises 17 evasion attacks, 5 forgery attacks, and 9
arXiv Detail & Related papers (2025-07-08T03:07:15Z)
MrM: Black-Box Membership Inference Attacks against Multimodal RAG Systems [31.53306157650065]
Multimodal retrieval-augmented generation (RAG) systems enhance large vision-language models by integrating cross-modal knowledge.<n>These knowledge databases may contain sensitive information that requires privacy protection.<n>MrM is the first black-box MIA framework targeted at multimodal RAG systems.
arXiv Detail & Related papers (2025-06-09T03:48:50Z)
Safeguarding Privacy of Retrieval Data against Membership Inference Attacks: Is This Query Too Close to Home? [4.488261272565345]
Mirabel is a similarity-based MIA detection framework designed for the RAG system.<n>We show that simple detect-and-hide strategies can successfully obfuscate attackers, maintain data utility, and remain system-agnostic.
arXiv Detail & Related papers (2025-05-28T07:35:07Z)
Privacy-Preserving Federated Embedding Learning for Localized Retrieval-Augmented Generation [60.81109086640437]
We propose a novel framework called Federated Retrieval-Augmented Generation (FedE4RAG) FedE4RAG facilitates collaborative training of client-side RAG retrieval models. We apply homomorphic encryption within federated learning to safeguard model parameters.
arXiv Detail & Related papers (2025-04-27T04:26:02Z)
MES-RAG: Bringing Multi-modal, Entity-Storage, and Secure Enhancements to RAG [65.0423152595537]
We propose MES-RAG, which enhances entity-specific query handling and provides accurate, secure, and consistent responses. MES-RAG introduces proactive security measures that ensure system integrity by applying protections prior to data access. Experimental results demonstrate that MES-RAG significantly improves both accuracy and recall, highlighting its effectiveness in advancing the security and utility of question-answering.
arXiv Detail & Related papers (2025-03-17T08:09:42Z)
Dataset Protection via Watermarked Canaries in Retrieval-Augmented LLMs [67.0310240737424]
We introduce a novel approach to safeguard the ownership of text datasets and effectively detect unauthorized use by the RA-LLMs. Our approach preserves the original data completely unchanged while protecting it by inserting specifically designed canary documents into the IP dataset. During the detection process, unauthorized usage is identified by querying the canary documents and analyzing the responses of RA-LLMs.
arXiv Detail & Related papers (2025-02-15T04:56:45Z)
MARAGE: Transferable Multi-Model Adversarial Attack for Retrieval-Augmented Generation Data Extraction [6.917134562107388]
Retrieval-Augmented Generation (RAG) offers a solution to hallucinations in Large Language Models (LLMs) by grounding their outputs to knowledge retrieved from external sources. Existing RAG extraction attacks often rely on manually crafted prompts, which limit their effectiveness. We introduce a framework called MARAGE for optimizing an adversarial string that, when appended to user queries submitted to a target RAG system, causes outputs containing the retrieved RAG data.
arXiv Detail & Related papers (2025-02-05T00:17:01Z)
HijackRAG: Hijacking Attacks against Retrieval-Augmented Large Language Models [18.301965456681764]
We reveal a novel vulnerability, the retrieval prompt hijack attack (HijackRAG) HijackRAG enables attackers to manipulate the retrieval mechanisms of RAG systems by injecting malicious texts into the knowledge database. We propose both black-box and white-box attack strategies tailored to different levels of the attacker's knowledge.
arXiv Detail & Related papers (2024-10-30T09:15:51Z)
Mask-based Membership Inference Attacks for Retrieval-Augmented Generation [25.516648802281626]
Retrieval-Augmented Generation (RAG) has been an effective approach to mitigate hallucinations in large language models (LLMs) Recently, there has been a trend storing up-to-date or copyrighted data in RAG knowledge databases instead of using it for LLM training. This practice has raised concerns about Membership Inference Attacks (MIAs), which aim to detect if a specific target document is stored in the RAG system's knowledge database.
arXiv Detail & Related papers (2024-10-26T10:43:39Z)
Rag and Roll: An End-to-End Evaluation of Indirect Prompt Manipulations in LLM-based Application Frameworks [12.061098193438022]
Retrieval Augmented Generation (RAG) is a technique commonly used to equip models with out of distribution knowledge. This paper investigates the security of RAG systems against end-to-end indirect prompt manipulations.
arXiv Detail & Related papers (2024-08-09T12:26:05Z)
Robust Utility-Preserving Text Anonymization Based on Large Language Models [80.5266278002083]
Text anonymization is crucial for sharing sensitive data while maintaining privacy. Existing techniques face the emerging challenges of re-identification attack ability of Large Language Models. This paper proposes a framework composed of three LLM-based components -- a privacy evaluator, a utility evaluator, and an optimization component.
arXiv Detail & Related papers (2024-07-16T14:28:56Z)
Generating Is Believing: Membership Inference Attacks against Retrieval-Augmented Generation [9.73190366574692]
Retrieval-Augmented Generation (RAG) is a technique that mitigates issues such as hallucinations and knowledge staleness in Large Language Models (LLMs) Existing research has demonstrated potential privacy risks associated with the LLMs of RAG. We present S$2$MIA, a underlineMembership underlineInference underlineAttack that utilizes the underlineSemantic underlineSimilarity between a given sample and the content generated by the RAG system.
arXiv Detail & Related papers (2024-06-27T14:58:38Z)
"Glue pizza and eat rocks" -- Exploiting Vulnerabilities in Retrieval-Augmented Generative Models [74.05368440735468]
Retrieval-Augmented Generative (RAG) models enhance Large Language Models (LLMs) In this paper, we demonstrate a security threat where adversaries can exploit the openness of these knowledge bases.
arXiv Detail & Related papers (2024-06-26T05:36:23Z)
Corpus Poisoning via Approximate Greedy Gradient Descent [48.5847914481222]
We propose Approximate Greedy Gradient Descent, a new attack on dense retrieval systems based on the widely used HotFlip method for generating adversarial passages. We show that our method achieves a high attack success rate on several datasets and using several retrievers, and can generalize to unseen queries and new domains.
arXiv Detail & Related papers (2024-06-07T17:02:35Z)
Certifiably Robust RAG against Retrieval Corruption [58.677292678310934]
Retrieval-augmented generation (RAG) has been shown vulnerable to retrieval corruption attacks. In this paper, we propose RobustRAG as the first defense framework against retrieval corruption attacks.
arXiv Detail & Related papers (2024-05-24T13:44:25Z)
The Good and The Bad: Exploring Privacy Issues in Retrieval-Augmented Generation (RAG) [56.67603627046346]
Retrieval-augmented generation (RAG) is a powerful technique to facilitate language model with proprietary and private data. In this work, we conduct empirical studies with novel attack methods, which demonstrate the vulnerability of RAG systems on leaking the private retrieval database.
arXiv Detail & Related papers (2024-02-23T18:35:15Z)
Model Stealing Attack against Recommender System [85.1927483219819]
Some adversarial attacks have achieved model stealing attacks against recommender systems. In this paper, we constrain the volume of available target data and queries and utilize auxiliary data, which shares the item set with the target data, to promote model stealing attacks.
arXiv Detail & Related papers (2023-12-18T05:28:02Z)

This list is automatically generated from the titles and abstracts of the papers in this site.