RemoteRAG: A Privacy-Preserving LLM Cloud RAG Service
- URL: http://arxiv.org/abs/2412.12775v1
- Date: Tue, 17 Dec 2024 10:36:52 GMT
- Title: RemoteRAG: A Privacy-Preserving LLM Cloud RAG Service
- Authors: Yihang Cheng, Lan Zhang, Junyang Wang, Mu Yuan, Yunhao Yao,
- Abstract summary: We are the first to formally define the privacy-preserving cloud RAG service to protect the user query. For privacy, we introduce $(n,\epsilon)$-DistanceDP to characterize privacy leakage of the user query and the leakage inferred from relevant documents. For efficiency, we limit the search range from the total documents to a small number of selected documents related to a perturbed embedding generated from $(n,\epsilon)$-DistanceDP.
- Score: 10.383191657228826
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Retrieval-augmented generation (RAG) improves the service quality of large language models by retrieving relevant documents from credible literature and integrating them into the context of the user query. Recently, the rise of the cloud RAG service has made it possible for users to query relevant documents conveniently. However, directly sending queries to the cloud brings potential privacy leakage. In this paper, we are the first to formally define the privacy-preserving cloud RAG service to protect the user query and propose RemoteRAG as a solution regarding privacy, efficiency, and accuracy. For privacy, we introduce $(n,\epsilon)$-DistanceDP to characterize privacy leakage of the user query and the leakage inferred from relevant documents. For efficiency, we limit the search range from the total documents to a small number of selected documents related to a perturbed embedding generated from $(n,\epsilon)$-DistanceDP, so that computation and communication costs required for privacy protection significantly decrease. For accuracy, we ensure that the small range includes target documents related to the user query with detailed theoretical analysis. Experimental results also demonstrate that RemoteRAG can resist existing embedding inversion attack methods while achieving no loss in retrieval under various settings. Moreover, RemoteRAG is efficient, incurring only $0.67$ seconds and $46.66$KB of data transmission ($2.72$ hours and $1.43$ GB with the non-optimized privacy-preserving scheme) when retrieving from a total of $10^6$ documents.
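The perturb-then-narrow idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the noise sampler is the standard n-dimensional Laplace-style mechanism (uniform direction, Gamma-distributed radius), assumed here as a stand-in for $(n,\epsilon)$-DistanceDP, and the names `perturb`, `top_k`, `remote_retrieve`, and the candidate-set size `k_prime` are hypothetical. The final re-ranking is done in plaintext for readability; the abstract does not detail how RemoteRAG protects that step, so it is not modeled here.

```python
import math
import random


def perturb(embedding, epsilon, rng):
    """Add noise whose density is proportional to exp(-epsilon * ||noise||).

    Assumption: this generic mechanism stands in for the paper's
    (n, epsilon)-DistanceDP perturbation.
    """
    n = len(embedding)
    # A normalized standard Gaussian vector gives a uniform random direction.
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(x * x for x in g))
    # In n dimensions the radius of this distribution is Gamma(n, 1/epsilon).
    radius = rng.gammavariate(n, 1.0 / epsilon)
    return [e + radius * x / norm for e, x in zip(embedding, g)]


def top_k(query, docs, k):
    """Indices of the k document embeddings closest to query (Euclidean)."""
    order = sorted(range(len(docs)), key=lambda i: math.dist(query, docs[i]))
    return order[:k]


def remote_retrieve(query, docs, k, k_prime, epsilon, rng):
    """Two-stage retrieval: narrow with a noisy query, then re-rank."""
    # Stage 1 (cloud side in this sketch): only the perturbed embedding is
    # used to select k_prime candidates out of the whole corpus.
    noisy = perturb(query, epsilon, rng)
    candidates = top_k(noisy, docs, k_prime)
    # Stage 2 (simplified): re-rank the small candidate set with the true
    # query. Done in plaintext here purely for illustration.
    reranked = sorted(candidates, key=lambda i: math.dist(query, docs[i]))
    return reranked[:k]


if __name__ == "__main__":
    rng = random.Random(0)
    docs = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(1000)]
    query = docs[42]
    print(remote_retrieve(query, docs, k=5, k_prime=50, epsilon=5.0, rng=rng))
```

With `k_prime` equal to the corpus size the result matches exact top-k search; smaller values lower the cost of the second stage at the risk of missing a target document, which is the trade-off the abstract says the paper analyzes theoretically.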
Related papers
- NeuroFilter: Privacy Guardrails for Conversational LLM Agents [50.75206727081996]
This work addresses the computational challenge of enforcing privacy for agentic Large Language Models (LLMs). NeuroFilter is a guardrail framework that operationalizes contextual integrity by mapping norm violations to simple directions in the model's activation space. A comprehensive evaluation across over 150,000 interactions, covering models from 7B to 70B parameters, illustrates the strong performance of NeuroFilter.
arXiv Detail & Related papers (2026-01-21T05:16:50Z)
- Efficient Privacy-Preserving Retrieval Augmented Generation with Distance-Preserving Encryption [25.87368479678027]
RAG has emerged as a key technique for enhancing response quality of LLMs without high computational cost. In traditional architectures, RAG services are provided by a single entity that hosts the dataset within a trusted local environment. This dependence on untrusted third-party services introduces privacy risks. We propose an efficient privacy-preserving RAG framework (ppRAG) tailored for untrusted cloud environments.
arXiv Detail & Related papers (2026-01-18T09:29:50Z)
- Private-RAG: Answering Multiple Queries with LLMs while Keeping Your Data Private [21.980739918403344]
Retrieval-augmented generation (RAG) enhances large language models (LLMs) by retrieving documents from an external corpus at inference time. When this corpus contains sensitive information, unprotected RAG systems are at risk of leaking private information. In this paper, we study the more practical multi-query setting and propose two DP-RAG algorithms.
arXiv Detail & Related papers (2025-11-10T21:12:32Z)
- ReliabilityRAG: Effective and Provably Robust Defense for RAG-based Web-Search [69.60882125603133]
We present ReliabilityRAG, a framework for adversarial robustness that explicitly leverages reliability information of retrieved documents. Our work is a significant step towards more effective, provably robust defenses against retrieved corpus corruption in RAG.
arXiv Detail & Related papers (2025-09-27T22:36:42Z)
- Privacy-Aware Decoding: Mitigating Privacy Leakage of Large Language Models in Retrieval-Augmented Generation [26.573578326262307]
Privacy-Aware Decoding (PAD) is a lightweight, inference-time defense that adaptively injects calibrated Gaussian noise into token logits during generation. PAD integrates confidence-based screening to selectively protect high-risk tokens, efficient sensitivity estimation to minimize unnecessary noise, and context-aware noise calibration to balance privacy with generation quality. Our work takes an important step toward mitigating privacy risks in RAG via decoding strategies, paving the way for universal and scalable privacy solutions in sensitive domains.
arXiv Detail & Related papers (2025-08-05T05:22:13Z)
- Transform Before You Query: A Privacy-Preserving Approach for Vector Retrieval with Embedding Space Alignment [7.491164990682839]
STEER (Secure Transformed Embedding vEctor Retrieval) is a private vector retrieval framework. It safeguards query text privacy while maintaining the retrieval accuracy.
arXiv Detail & Related papers (2025-07-24T15:41:34Z)
- MAGPIE: A dataset for Multi-AGent contextual PrIvacy Evaluation [54.410825977390274]
Existing benchmarks to evaluate contextual privacy in LLM-agents primarily assess single-turn, low-complexity tasks. We first present a benchmark - MAGPIE comprising 158 real-life high-stakes scenarios across 15 domains. We then evaluate the current state-of-the-art LLMs on their understanding of contextually private data and their ability to collaborate without violating user privacy.
arXiv Detail & Related papers (2025-06-25T18:04:25Z)
- Safeguarding Privacy of Retrieval Data against Membership Inference Attacks: Is This Query Too Close to Home? [4.488261272565345]
Mirabel is a similarity-based MIA detection framework designed for the RAG system. We show that simple detect-and-hide strategies can successfully obfuscate attackers, maintain data utility, and remain system-agnostic.
arXiv Detail & Related papers (2025-05-28T07:35:07Z)
- EnronQA: Towards Personalized RAG over Private Documents [10.561751736295022]
Retrieval Augmented Generation (RAG) has become one of the most popular methods for bringing knowledge-intensive context to large language models (LLMs).
Current RAG benchmarks for validating and optimizing RAG pipelines draw their corpora from public data such as Wikipedia or generic web pages.
We release the EnronQA benchmark, a dataset of 103,638 emails with 528,304 question-answer pairs across 150 different user inboxes.
arXiv Detail & Related papers (2025-05-01T03:07:30Z)
- Privacy-Preserving Federated Embedding Learning for Localized Retrieval-Augmented Generation [60.81109086640437]
We propose a novel framework called Federated Retrieval-Augmented Generation (FedE4RAG).
FedE4RAG facilitates collaborative training of client-side RAG retrieval models.
We apply homomorphic encryption within federated learning to safeguard model parameters.
arXiv Detail & Related papers (2025-04-27T04:26:02Z)
- MES-RAG: Bringing Multi-modal, Entity-Storage, and Secure Enhancements to RAG [65.0423152595537]
We propose MES-RAG, which enhances entity-specific query handling and provides accurate, secure, and consistent responses.
MES-RAG introduces proactive security measures that ensure system integrity by applying protections prior to data access.
Experimental results demonstrate that MES-RAG significantly improves both accuracy and recall, highlighting its effectiveness in advancing the security and utility of question-answering.
arXiv Detail & Related papers (2025-03-17T08:09:42Z)
- Dataset Protection via Watermarked Canaries in Retrieval-Augmented LLMs [67.0310240737424]
We introduce a novel approach to safeguard the ownership of text datasets and effectively detect unauthorized use by the RA-LLMs.
Our approach preserves the original data completely unchanged while protecting it by inserting specifically designed canary documents into the IP dataset.
During the detection process, unauthorized usage is identified by querying the canary documents and analyzing the responses of RA-LLMs.
arXiv Detail & Related papers (2025-02-15T04:56:45Z)
- Riddle Me This! Stealthy Membership Inference for Retrieval-Augmented Generation [18.098228823748617]
We present Interrogation Attack (IA), a membership inference technique targeting documents in the RAG datastore.
We demonstrate successful inference with just 30 queries while remaining stealthy.
We observe a 2x improvement in TPR@1%FPR over prior inference attacks across diverse RAG configurations.
arXiv Detail & Related papers (2025-02-01T04:01:18Z)
- MAIN-RAG: Multi-Agent Filtering Retrieval-Augmented Generation [34.66546005629471]
Large Language Models (LLMs) are essential tools for various natural language processing tasks but often suffer from generating outdated or incorrect information.
Retrieval-Augmented Generation (RAG) addresses this issue by incorporating external, real-time information retrieval to ground LLM responses.
To tackle this problem, we propose Multi-Agent Filtering Retrieval-Augmented Generation (MAIN-RAG).
MAIN-RAG is a training-free RAG framework that leverages multiple LLM agents to collaboratively filter and score retrieved documents.
arXiv Detail & Related papers (2024-12-31T08:07:26Z)
- Privacy-Preserving Retrieval Augmented Generation with Differential Privacy [25.896416088293908]
Retrieval augmented generation (RAG) assists large language models (LLMs) by directly providing relevant information from external knowledge sources. RAG outputs risk leaking sensitive information from the external data source. In this work, we explore RAG under differential privacy (DP), a formal guarantee of data privacy.
arXiv Detail & Related papers (2024-12-06T01:20:16Z)
- RA-WEBs: Remote Attestation for WEB services [1.3445048453161086]
TEEs offer a promising solution by creating secure environments that protect data and code from such threats.
One key feature is remote attestation (RA), which enables integrity verification of a TEE.
We propose RA-WEBs (Remote Attestation for Web services), a novel RA protocol designed for high compatibility with the current web ecosystem.
arXiv Detail & Related papers (2024-11-02T18:46:58Z)
- Optimizing Query Generation for Enhanced Document Retrieval in RAG [53.10369742545479]
Large Language Models (LLMs) excel in various language tasks but they often generate incorrect information.
Retrieval-Augmented Generation (RAG) aims to mitigate this by using document retrieval for accurate responses.
arXiv Detail & Related papers (2024-07-17T05:50:32Z)
- $\texttt{MixGR}$: Enhancing Retriever Generalization for Scientific Domain through Complementary Granularity [88.78750571970232]
This paper introduces $\texttt{MixGR}$, which improves dense retrievers' awareness of query-document matching.
$\texttt{MixGR}$ fuses various metrics based on granularities to a united score that reflects a comprehensive query-document similarity.
arXiv Detail & Related papers (2024-07-15T13:04:09Z)
- Mitigating the Privacy Issues in Retrieval-Augmented Generation (RAG) via Pure Synthetic Data [51.41288763521186]
Retrieval-augmented generation (RAG) enhances the outputs of language models by integrating relevant information retrieved from external knowledge sources.
RAG systems may face severe privacy risks when retrieving private data.
We propose using synthetic data as a privacy-preserving alternative for the retrieval data.
arXiv Detail & Related papers (2024-06-20T22:53:09Z)
- Machine Against the RAG: Jamming Retrieval-Augmented Generation with Blocker Documents [17.95339197094059]
Retrieval-augmented generation (RAG) systems respond to queries by retrieving relevant documents from a knowledge database, then generating an answer by applying an LLM to the retrieved documents.
We demonstrate that RAG systems that operate on databases with untrusted content are vulnerable to a new class of denial-of-service attacks we call jamming.
arXiv Detail & Related papers (2024-06-09T17:55:55Z)
- The Good and The Bad: Exploring Privacy Issues in Retrieval-Augmented Generation (RAG) [56.67603627046346]
Retrieval-augmented generation (RAG) is a powerful technique to facilitate language model with proprietary and private data.
In this work, we conduct empirical studies with novel attack methods, which demonstrate the vulnerability of RAG systems on leaking the private retrieval database.
arXiv Detail & Related papers (2024-02-23T18:35:15Z)
- Local Differentially Private Heavy Hitter Detection in Data Streams with Bounded Memory [31.652076018162507]
We present a novel framework HG-LDP to achieve accurate Top-$k$ item detection at bounded memory expense, while providing rigorous local differential privacy (LDP) protection.
We conduct comprehensive experiments on both synthetic and real-world datasets to show that the proposed advanced schemes achieve a superior "accuracy-privacy-memory efficiency" tradeoff.
arXiv Detail & Related papers (2023-11-27T18:28:15Z)
- A Randomized Approach for Tight Privacy Accounting [63.67296945525791]
We propose a new differential privacy paradigm called estimate-verify-release (EVR).
EVR paradigm first estimates the privacy parameter of a mechanism, then verifies whether it meets this guarantee, and finally releases the query output.
Our empirical evaluation shows the newly proposed EVR paradigm improves the utility-privacy tradeoff for privacy-preserving machine learning.
arXiv Detail & Related papers (2023-04-17T00:38:01Z)
- LaPraDoR: Unsupervised Pretrained Dense Retriever for Zero-Shot Text Retrieval [55.097573036580066]
Experimental results show that LaPraDoR achieves state-of-the-art performance compared with supervised dense retrieval models.
Compared to re-ranking, our lexicon-enhanced approach can be run in milliseconds (22.5x faster) while achieving superior performance.
arXiv Detail & Related papers (2022-03-11T18:53:12Z)
- Generative Adversarial User Privacy in Lossy Single-Server Information Retrieval [18.274573259364026]
We consider the problem of information retrieval from a dataset of files stored on a single server under both a user distortion and a user privacy constraint.
Specifically, a user requesting a file from the dataset should be able to reconstruct the requested file with a prescribed distortion.
In addition, the identity of the requested file should be kept private from the server with a prescribed privacy level.
arXiv Detail & Related papers (2020-12-07T18:31:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.