Safeguarding Privacy of Retrieval Data against Membership Inference Attacks: Is This Query Too Close to Home?
- URL: http://arxiv.org/abs/2505.22061v1
- Date: Wed, 28 May 2025 07:35:07 GMT
- Title: Safeguarding Privacy of Retrieval Data against Membership Inference Attacks: Is This Query Too Close to Home?
- Authors: Yujin Choi, Youngjoo Park, Junyoung Byun, Jaewook Lee, Jinseong Park
- Abstract summary: Mirabel is a similarity-based MIA detection framework designed for the RAG system. We show that simple detect-and-hide strategies can successfully obfuscate attackers, maintain data utility, and remain system-agnostic.
- Score: 4.488261272565345
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Retrieval-augmented generation (RAG) mitigates the hallucination problem in large language models (LLMs) and has proven effective for specific, personalized applications. However, passing private retrieved documents directly to LLMs introduces vulnerability to membership inference attacks (MIAs), which try to determine whether the target datum exists in the private external database or not. Based on the insight that MIA queries typically exhibit high similarity to only one target document, we introduce Mirabel, a similarity-based MIA detection framework designed for the RAG system. With the proposed Mirabel, we show that simple detect-and-hide strategies can successfully obfuscate attackers, maintain data utility, and remain system-agnostic. We experimentally prove its detection and defense against various state-of-the-art MIA methods and its adaptability to existing private RAG systems.
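The abstract's core insight suggests a simple detection rule: an MIA query's similarity should concentrate on a single retrieved document. A minimal detect-and-hide sketch under that assumption (the gap test, threshold, and helper names are illustrative, not the authors' implementation):

```python
import numpy as np

def detect_suspicious_query(query_emb, doc_embs, gap_threshold=0.2):
    """Flag a query whose similarity concentrates on one document.

    MIA queries tend to be near-copies of a single target document, so
    the top-1 cosine similarity should stand far above the runner-up.
    Assumes >= 2 candidate documents; the threshold is illustrative.
    """
    q = query_emb / np.linalg.norm(query_emb)
    d = doc_embs / np.linalg.norm(doc_embs, axis=1, keepdims=True)
    sims = d @ q                      # cosine similarity per document
    runner_up, best = np.sort(sims)[-2:]
    is_attack = (best - runner_up) > gap_threshold
    target = int(np.argmax(sims)) if is_attack else None
    return is_attack, target

def answer_with_hiding(query, query_emb, doc_embs, docs, generate):
    """Detect-and-hide: drop the suspected target before generation."""
    is_attack, target = detect_suspicious_query(query_emb, doc_embs)
    context = [doc for i, doc in enumerate(docs) if i != target]
    return generate(query, context)   # works with any RAG backend
```

Hiding only the suspected target keeps the rest of the retrieved context available, which is how a defense of this kind can preserve utility while remaining agnostic to the underlying RAG system.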
Related papers
- Fine-Grained Privacy Extraction from Retrieval-Augmented Generation Systems via Knowledge Asymmetry Exploitation [15.985529058573912]
Retrieval-augmented generation (RAG) systems enhance large language models (LLMs) by integrating external knowledge bases. Existing privacy attacks on RAG systems can trigger data leakage but often fail to accurately isolate knowledge-base-derived sentences within mixed responses. This paper presents a novel black-box attack framework that exploits knowledge asymmetry between RAG and standard LLMs to achieve fine-grained privacy extraction.
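As a rough illustration of knowledge asymmetry, one might score each sentence of the RAG response under a standalone LLM and attribute the improbable ones to the knowledge base; the scoring function and threshold below are assumptions, not the paper's method:

```python
def knowledge_asymmetry_filter(question, rag_answer_sentences,
                               base_llm_logprob, threshold=-4.0):
    """Isolate sentences likely drawn from the private knowledge base.

    Heuristic sketch: a standard LLM with no access to the knowledge
    base should find knowledge-base-derived sentences very unlikely.
    `base_llm_logprob(question, sent)` is assumed to return the mean
    token log-probability of a sentence; the cutoff is illustrative.
    """
    leaked = []
    for sent in rag_answer_sentences:
        if base_llm_logprob(question, sent) < threshold:
            leaked.append(sent)  # improbable under the base LLM alone
    return leaked
```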
arXiv Detail & Related papers (2025-07-31T03:50:16Z)
- DATABench: Evaluating Dataset Auditing in Deep Learning from an Adversarial Perspective [59.66984417026933]
We introduce a novel taxonomy, classifying existing methods based on their reliance on internal features (IF) (inherent to the data) versus external features (EF) (artificially introduced for auditing). We formulate two primary attack types: evasion attacks, designed to conceal the use of a dataset, and forgery attacks, intending to falsely implicate an unused dataset. Building on the understanding of existing methods and attack objectives, we further propose systematic attack strategies: decoupling, removal, and detection for evasion; adversarial example-based methods for forgery. Our benchmark, DATABench, comprises 17 evasion attacks, 5 forgery attacks, and 9 …
arXiv Detail & Related papers (2025-07-08T03:07:15Z)
- MrM: Black-Box Membership Inference Attacks against Multimodal RAG Systems [31.53306157650065]
Multimodal retrieval-augmented generation (RAG) systems enhance large vision-language models by integrating cross-modal knowledge. These knowledge databases may contain sensitive information that requires privacy protection. MrM is the first black-box MIA framework targeted at multimodal RAG systems.
arXiv Detail & Related papers (2025-06-09T03:48:50Z)
- Privacy-Preserving Federated Embedding Learning for Localized Retrieval-Augmented Generation [60.81109086640437]
We propose a novel framework called Federated Retrieval-Augmented Generation (FedE4RAG). FedE4RAG facilitates collaborative training of client-side RAG retrieval models. We apply homomorphic encryption within federated learning to safeguard model parameters.
arXiv Detail & Related papers (2025-04-27T04:26:02Z)
- Mask-based Membership Inference Attacks for Retrieval-Augmented Generation [25.516648802281626]
Retrieval-Augmented Generation (RAG) has been an effective approach to mitigate hallucinations in large language models (LLMs). Recently, there has been a trend of storing up-to-date or copyrighted data in RAG knowledge databases instead of using it for LLM training. This practice has raised concerns about Membership Inference Attacks (MIAs), which aim to detect whether a specific target document is stored in the RAG system's knowledge database.
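A mask-and-predict probe of the kind the title suggests could look like the following sketch; the masking rate, prompt, and acceptance rule are illustrative guesses rather than the paper's procedure:

```python
import random

def mask_based_mia(document, rag_query, mask_rate=0.15, accept_rate=0.7):
    """Sketch of a mask-based membership probe against a RAG system.

    Mask a fraction of the words in the target document, ask the RAG
    system to fill the blanks, and count exact recoveries: a member
    document should be retrieved into the context and copied almost
    verbatim, while a non-member yields mostly wrong fills. Assumes the
    response echoes the passage word-aligned; constants are illustrative.
    """
    words = document.split()
    idx = set(random.sample(range(len(words)),
                            max(1, int(mask_rate * len(words)))))
    masked = [("[MASK]" if i in idx else w) for i, w in enumerate(words)]
    prompt = "Fill in each [MASK] in the passage below:\n" + " ".join(masked)
    filled = rag_query(prompt).split()
    hits = sum(1 for i in idx if i < len(filled) and filled[i] == words[i])
    return hits / len(idx) >= accept_rate  # True => predicted member
```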
arXiv Detail & Related papers (2024-10-26T10:43:39Z)
- Generating Is Believing: Membership Inference Attacks against Retrieval-Augmented Generation [9.73190366574692]
Retrieval-Augmented Generation (RAG) is a technique that mitigates issues such as hallucinations and knowledge staleness in Large Language Models (LLMs).
Existing research has demonstrated potential privacy risks associated with the LLMs of RAG.
We present S$^2$MIA, a Membership Inference Attack that utilizes the Semantic Similarity between a given sample and the content generated by the RAG system.
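One plausible instantiation of the semantic-similarity test: feed the RAG system part of the target sample and compare its output with the held-back remainder (the split, embedding function, and threshold are assumptions, not the paper's exact protocol):

```python
import numpy as np

def s2mia_probe(sample, rag_query, embed, threshold=0.8):
    """Sketch of a semantic-similarity membership probe.

    Prompt the RAG system with the first half of the target sample and
    measure how semantically close the generated continuation is to the
    true second half; a member document pushes the similarity up because
    it is retrieved verbatim into the context. Threshold is illustrative.
    """
    half = len(sample) // 2
    generated = rag_query(sample[:half])  # continuation from the RAG system
    a, b = embed(generated), embed(sample[half:])
    sim = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    return sim >= threshold               # True => predicted member
```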
arXiv Detail & Related papers (2024-06-27T14:58:38Z)
- Jailbreaking as a Reward Misspecification Problem [80.52431374743998]
We propose a novel perspective that attributes this vulnerability to reward misspecification during the alignment process. We introduce a metric, ReGap, to quantify the extent of reward misspecification and demonstrate its effectiveness. We present ReMiss, a system for automated red teaming that generates adversarial prompts in a reward-misspecified space.
arXiv Detail & Related papers (2024-06-20T15:12:27Z)
- Is My Data in Your Retrieval Database? Membership Inference Attacks Against Retrieval Augmented Generation [0.9217021281095907]
We introduce an efficient and easy-to-use method for conducting a Membership Inference Attack (MIA) against RAG systems. We demonstrate the effectiveness of our attack using two benchmark datasets and multiple generative models. Our findings highlight the importance of implementing security countermeasures in deployed RAG systems.
arXiv Detail & Related papers (2024-05-30T19:46:36Z)
- EmInspector: Combating Backdoor Attacks in Federated Self-Supervised Learning Through Embedding Inspection [53.25863925815954]
Federated self-supervised learning (FSSL) has emerged as a promising paradigm that enables the exploitation of clients' vast amounts of unlabeled data.
While FSSL offers advantages, its susceptibility to backdoor attacks has not been investigated.
We propose the Embedding Inspector (EmInspector) that detects malicious clients by inspecting the embedding space of local models.
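Embedding inspection can be sketched as an outlier test over client models' embeddings of a shared probe batch; the consensus-distance scoring and z-score cutoff below are illustrative, not EmInspector's actual criterion:

```python
import numpy as np

def embedding_inspection(client_models, probe_batch, embed_with, z_cut=2.0):
    """Sketch of embedding-space inspection for federated SSL.

    Embed a fixed probe batch with every client's local model; a
    backdoored model maps the probes to a region far from the consensus,
    so flag clients whose mean embedding is a statistical outlier.
    `embed_with(model, batch)` returns an (N, d) array and is assumed.
    """
    means = np.stack([embed_with(m, probe_batch).mean(axis=0)
                      for m in client_models])
    center = means.mean(axis=0)                  # consensus embedding
    dists = np.linalg.norm(means - center, axis=1)
    z = (dists - dists.mean()) / (dists.std() + 1e-12)
    return [i for i, s in enumerate(z) if s > z_cut]  # suspected clients
```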
arXiv Detail & Related papers (2024-05-21T06:14:49Z)
- The Good and The Bad: Exploring Privacy Issues in Retrieval-Augmented Generation (RAG) [56.67603627046346]
Retrieval-augmented generation (RAG) is a powerful technique to augment language models with proprietary and private data.
In this work, we conduct empirical studies with novel attack methods, which demonstrate the vulnerability of RAG systems to leaking the private retrieval database.
arXiv Detail & Related papers (2024-02-23T18:35:15Z)
- Avoid Adversarial Adaption in Federated Learning by Multi-Metric Investigations [55.2480439325792]
Federated Learning (FL) facilitates decentralized machine learning model training, preserving data privacy, lowering communication costs, and boosting model performance through diversified data sources.
FL faces vulnerabilities such as poisoning attacks, undermining model integrity with both untargeted performance degradation and targeted backdoor attacks.
We define a new notion of strong adaptive adversaries, capable of adapting to multiple objectives simultaneously.
MESAS is the first defense robust against strong adaptive adversaries, effective in real-world data scenarios, with an average overhead of just 24.37 seconds.
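The multi-metric idea can be sketched as screening each client update against several statistics at once, so an adaptive adversary must look inconspicuous on all of them simultaneously; the metric set and cutoff are assumptions, not MESAS itself:

```python
import numpy as np

def multi_metric_filter(updates, z_cut=2.5):
    """Sketch of multi-metric screening of client updates.

    A strongly adaptive adversary can stay unremarkable on one metric
    but struggles to match honest clients on several at once, so an
    update is kept only if it is inconspicuous on every metric. Each
    update is a list of parameter arrays; constants are illustrative.
    """
    flat = [np.concatenate([p.ravel() for p in u]) for u in updates]
    metrics = np.array([[np.linalg.norm(v),  # update magnitude
                         v.var(),            # spread of parameters
                         np.abs(v).max()]    # largest single change
                        for v in flat])
    z = np.abs(metrics - metrics.mean(0)) / (metrics.std(0) + 1e-12)
    return [u for u, s in zip(updates, z) if (s <= z_cut).all()]
```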
arXiv Detail & Related papers (2023-06-06T11:44:42Z)
- Membership Inference Attacks against Synthetic Data through Overfitting Detection [84.02632160692995]
We argue for a realistic MIA setting that assumes the attacker has some knowledge of the underlying data distribution.
We propose DOMIAS, a density-based MIA model that aims to infer membership by targeting local overfitting of the generative model.
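A density-ratio test in the spirit of DOMIAS can be sketched with kernel density estimates: a point far more likely under the synthetic-data density than under a reference density indicates local overfitting of the generative model (the estimator and decision threshold are illustrative):

```python
import numpy as np
from scipy.stats import gaussian_kde

def density_ratio_score(x, synthetic, reference):
    """Sketch of a density-based membership test.

    Fit one density to the generator's synthetic output and one to a
    reference sample from the underlying distribution; a large ratio
    p_synth(x) / p_ref(x) signals local overfitting around x, i.e.
    likely membership of x in the training set. `synthetic` and
    `reference` are (n_samples, n_dims) arrays, x a 1-D point.
    """
    p_synth = gaussian_kde(synthetic.T)  # gaussian_kde wants (dims, n)
    p_ref = gaussian_kde(reference.T)
    return float(p_synth(x)[0] / (p_ref(x)[0] + 1e-12))

# Example decision rule: predict membership when the ratio exceeds 1.
# is_member = density_ratio_score(x, synthetic, reference) > 1.0
```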
arXiv Detail & Related papers (2023-02-24T11:27:39Z)