Machine Against the RAG: Jamming Retrieval-Augmented Generation with Blocker Documents
- URL: http://arxiv.org/abs/2406.05870v3
- Date: Mon, 20 Jan 2025 18:01:06 GMT
- Title: Machine Against the RAG: Jamming Retrieval-Augmented Generation with Blocker Documents
- Authors: Avital Shafran, Roei Schuster, Vitaly Shmatikov,
- Abstract summary: Retrieval-augmented generation (RAG) systems respond to queries by retrieving relevant documents from a knowledge database.
We demonstrate that RAG systems that operate on databases with untrusted content are vulnerable to denial-of-service attacks we call jamming.
- Score: 17.95339197094059
- License:
- Abstract: Retrieval-augmented generation (RAG) systems respond to queries by retrieving relevant documents from a knowledge database and applying an LLM to the retrieved documents. We demonstrate that RAG systems that operate on databases with untrusted content are vulnerable to denial-of-service attacks we call jamming. An adversary can add a single ``blocker'' document to the database that will be retrieved in response to a specific query and result in the RAG system not answering this query - ostensibly because it lacks the relevant information or because the answer is unsafe. We describe and measure the efficacy of several methods for generating blocker documents, including a new method based on black-box optimization. This method (1) does not rely on instruction injection, (2) does not require the adversary to know the embedding or LLM used by the target RAG system, and (3) does not rely on an auxiliary LLM. We evaluate jamming attacks on several LLMs and embeddings and demonstrate that the existing safety metrics for LLMs do not capture their vulnerability to jamming. We then discuss defenses against blocker documents.
Related papers
- Dataset Protection via Watermarked Canaries in Retrieval-Augmented LLMs [67.0310240737424]
We introduce a novel approach to safeguard the ownership of text datasets and effectively detect unauthorized use by the RA-LLMs.
Our approach preserves the original data completely unchanged while protecting it by inserting specifically designed canary documents into the IP dataset.
During the detection process, unauthorized usage is identified by querying the canary documents and analyzing the responses of RA-LLMs.
arXiv Detail & Related papers (2025-02-15T04:56:45Z) - Riddle Me This! Stealthy Membership Inference for Retrieval-Augmented Generation [18.098228823748617]
We present Interrogation Attack (IA), a membership inference technique targeting documents in the RAG datastore.
We demonstrate successful inference with just 30 queries while remaining stealthy.
We observe a 2x improvement in TPR@1%FPR over prior inference attacks across diverse RAG configurations.
arXiv Detail & Related papers (2025-02-01T04:01:18Z) - mR$^2$AG: Multimodal Retrieval-Reflection-Augmented Generation for Knowledge-Based VQA [78.45521005703958]
multimodal Retrieval-Augmented Generation (mRAG) is naturally introduced to provide MLLMs with comprehensive and up-to-date knowledge.
We propose a novel framework called textbfRetrieval-textbfReftextbfAugmented textbfGeneration (mR$2$AG) which achieves adaptive retrieval and useful information localization.
mR$2$AG significantly outperforms state-of-the-art MLLMs on INFOSEEK and Encyclopedic-VQA
arXiv Detail & Related papers (2024-11-22T16:15:50Z) - Mask-based Membership Inference Attacks for Retrieval-Augmented Generation [25.516648802281626]
Retrieval-Augmented Generation (RAG) has been an effective approach to mitigate hallucinations in large language models (LLMs)
Recently, there has been a trend storing up-to-date or copyrighted data in RAG knowledge databases instead of using it for LLM training.
This practice has raised concerns about Membership Inference Attacks (MIAs), which aim to detect if a specific target document is stored in the RAG system's knowledge database.
arXiv Detail & Related papers (2024-10-26T10:43:39Z) - ChunkRAG: Novel LLM-Chunk Filtering Method for RAG Systems [2.8692611791027893]
Retrieval-Augmented Generation (RAG) systems generate inaccurate responses due to the retrieval of irrelevant or loosely related information.
We propose ChunkRAG, a framework that enhances RAG systems by evaluating and filtering retrieved information at the chunk level.
arXiv Detail & Related papers (2024-10-25T14:07:53Z) - GuardAgent: Safeguard LLM Agents by a Guard Agent via Knowledge-Enabled Reasoning [79.07152553060601]
Existing methods for enhancing the safety of large language models (LLMs) are not directly transferable to LLM-powered agents.
We propose GuardAgent, the first LLM agent as a guardrail to other LLM agents.
GuardAgent comprises two steps: 1) creating a task plan by analyzing the provided guard requests, and 2) generating guardrail code based on the task plan and executing the code by calling APIs or using external engines.
arXiv Detail & Related papers (2024-06-13T14:49:26Z) - Phantom: General Trigger Attacks on Retrieval Augmented Language Generation [30.63258739968483]
Retrieval Augmented Generation (RAG) expands the capabilities of modern large language models (LLMs)
We propose new attack vectors that allow an adversary to inject a single malicious document into a RAG system's knowledge base, and mount a backdoor poisoning attack.
We demonstrate our attacks on multiple LLM architectures, including Gemma, Vicuna, and Llama, and show that they transfer to GPT-3.5 Turbo and GPT-4.
arXiv Detail & Related papers (2024-05-30T21:19:24Z) - Is My Data in Your Retrieval Database? Membership Inference Attacks Against Retrieval Augmented Generation [0.9217021281095907]
We introduce an efficient and easy-to-use method for conducting a Membership Inference Attack (MIA) against RAG systems.
We demonstrate the effectiveness of our attack using two benchmark datasets and multiple generative models.
Our findings highlight the importance of implementing security countermeasures in deployed RAG systems.
arXiv Detail & Related papers (2024-05-30T19:46:36Z) - PromptReps: Prompting Large Language Models to Generate Dense and Sparse Representations for Zero-Shot Document Retrieval [76.50690734636477]
We propose PromptReps, which combines the advantages of both categories: no need for training and the ability to retrieve from the whole corpus.
The retrieval system harnesses both dense text embedding and sparse bag-of-words representations.
arXiv Detail & Related papers (2024-04-29T04:51:30Z) - Do-Not-Answer: A Dataset for Evaluating Safeguards in LLMs [59.596335292426105]
This paper collects the first open-source dataset to evaluate safeguards in large language models.
We train several BERT-like classifiers to achieve results comparable with GPT-4 on automatic safety evaluation.
arXiv Detail & Related papers (2023-08-25T14:02:12Z) - Query Rewriting for Retrieval-Augmented Large Language Models [139.242907155883]
Large Language Models (LLMs) play powerful, black-box readers in the retrieve-then-read pipeline.
This work introduces a new framework, Rewrite-Retrieve-Read instead of the previous retrieve-then-read for the retrieval-augmented LLMs.
arXiv Detail & Related papers (2023-05-23T17:27:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.