GRADA: Graph-based Reranker against Adversarial Documents Attack
- URL: http://arxiv.org/abs/2505.07546v1
- Date: Mon, 12 May 2025 13:27:35 GMT
- Title: GRADA: Graph-based Reranker against Adversarial Documents Attack
- Authors: Jingjie Zheng, Aryo Pradipta Gema, Giwon Hong, Xuanli He, Pasquale Minervini, Youcheng Sun, Qiongkai Xu
- Abstract summary: Adversarial Document Attacks manipulate the retrieval process by introducing documents that are adversarial yet semantically similar to the query. We propose a Graph-based Reranking against Adversarial Document Attacks framework aiming at preserving retrieval quality while significantly reducing the success of adversaries.
- Score: 24.95583601804124
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Retrieval Augmented Generation (RAG) frameworks improve the accuracy of large language models (LLMs) by integrating external knowledge from retrieved documents, thereby overcoming the limitations of models' static intrinsic knowledge. However, these systems are susceptible to adversarial attacks that manipulate the retrieval process by introducing documents that are adversarial yet semantically similar to the query. Notably, while these adversarial documents resemble the query, they exhibit weak similarity to benign documents in the retrieval set. Thus, we propose a simple yet effective Graph-based Reranking against Adversarial Document Attacks (GRADA) framework aiming at preserving retrieval quality while significantly reducing the success of adversaries. Our study evaluates the effectiveness of our approach through experiments conducted on five LLMs: GPT-3.5-Turbo, GPT-4o, Llama3.1-8b, Llama3.1-70b, and Qwen2.5-7b. We use three datasets to assess performance, with results from the Natural Questions dataset demonstrating up to an 80% reduction in attack success rates while maintaining minimal loss in accuracy.
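The abstract's key observation (adversarial documents resemble the query but are only weakly similar to the benign documents retrieved alongside them) can be illustrated with a small sketch. This is not the authors' implementation: the bag-of-words similarity, the PageRank-style scoring, and the names `bow`, `cosine`, and `graph_rerank` are all assumptions chosen for illustration, and a real system would more plausibly use dense embeddings.

```python
import math
from collections import Counter


def bow(text):
    """Lowercased bag-of-words vector (an assumed stand-in for a real embedding)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def graph_rerank(docs, damping=0.85, iters=50):
    """Rank documents by PageRank-style centrality on a doc-doc similarity graph.

    Hypothetical helper: documents that are weakly connected to the rest of
    the retrieved set receive low scores and sink to the bottom of the ranking.
    """
    n = len(docs)
    vecs = [bow(d) for d in docs]
    # Edge weights are pairwise similarities; self-loops are excluded.
    w = [[cosine(vecs[i], vecs[j]) if i != j else 0.0 for j in range(n)]
         for i in range(n)]
    out = [sum(row) for row in w]  # weighted out-degree of each node
    scores = [1.0 / n] * n
    for _ in range(iters):
        new = []
        for j in range(n):
            # Nodes with zero out-degree pass on no score (a simplification).
            incoming = sum(scores[i] * w[i][j] / out[i]
                           for i in range(n) if out[i] > 0)
            new.append((1 - damping) / n + damping * incoming)
        scores = new
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    return order, scores


docs = [
    "paris is the capital city of france",
    "the capital of france is paris",
    "france has its capital city in paris",
    "question capital france? answer: london",  # toy adversarial document
]
order, scores = graph_rerank(docs)
print(order)
```

In this toy set the three benign documents reinforce one another through shared vocabulary, while the injected document shares a single token with them and therefore ends up ranked last. Scoring by mutual similarity rather than query similarity alone is the design choice being sketched; how GRADA actually builds and scores its graph is described in the paper itself.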
Related papers
- Collapse of Dense Retrievers: Short, Early, and Literal Biases Outranking Factual Evidence [56.09494651178128]
Retrieval models are commonly used in Information Retrieval (IR) applications, such as Retrieval-Augmented Generation (RAG). We show that retrievers often rely on superficial patterns like over-prioritizing document beginnings, shorter documents, repeated entities, and literal matches. We show that these biases have direct consequences for downstream applications like RAG, where retrieval-preferred documents can mislead LLMs.
arXiv Detail & Related papers (2025-03-06T23:23:13Z) - Worse than Zero-shot? A Fact-Checking Dataset for Evaluating the Robustness of RAG Against Misleading Retrievals [3.9139847342664864]
We introduce RAGuard, a fact-checking dataset designed to evaluate the robustness of RAG systems against misleading retrievals. RAGuard categorizes retrieved evidence into three types: supporting, misleading, and irrelevant. Our benchmark experiments reveal that when exposed to misleading retrievals, all tested LLM-powered RAG systems perform worse than their zero-shot baselines.
arXiv Detail & Related papers (2025-02-22T05:50:15Z) - Riddle Me This! Stealthy Membership Inference for Retrieval-Augmented Generation [18.098228823748617]
We present Interrogation Attack (IA), a membership inference technique targeting documents in the RAG datastore. We demonstrate successful inference with just 30 queries while remaining stealthy. We observe a 2x improvement in TPR@1%FPR over prior inference attacks across diverse RAG configurations.
arXiv Detail & Related papers (2025-02-01T04:01:18Z) - Unsupervised dense retrieval with conterfactual contrastive learning [16.679649921935482]
We propose to improve the robustness of dense retrieval models by enhancing their sensitivity to fine-grained relevance signals. A model achieving sensitivity in this context should exhibit high variance when the key passages that determine a document's relevance to a query are modified. Motivated by causality and counterfactual analysis, we propose a series of counterfactual regularization methods.
arXiv Detail & Related papers (2024-12-30T07:01:34Z) - JudgeRank: Leveraging Large Language Models for Reasoning-Intensive Reranking [81.88787401178378]
We introduce JudgeRank, a novel agentic reranker that emulates human cognitive processes when assessing document relevance.
We evaluate JudgeRank on the reasoning-intensive BRIGHT benchmark, demonstrating substantial performance improvements over first-stage retrieval methods.
In addition, JudgeRank performs on par with fine-tuned state-of-the-art rerankers on the popular BEIR benchmark, validating its zero-shot generalization capability.
arXiv Detail & Related papers (2024-10-31T18:43:12Z) - LLMs Can Patch Up Missing Relevance Judgments in Evaluation [56.51461892988846]
We use large language models (LLMs) to automatically label unjudged documents.
We simulate scenarios with varying degrees of holes by randomly dropping relevant documents from the relevance judgment in TREC DL tracks.
Our method achieves a Kendall tau correlation of 0.87 and 0.92 on average for Vicuna-7B and GPT-3.5 Turbo, respectively.
arXiv Detail & Related papers (2024-05-08T00:32:19Z) - Outlier Robust Adversarial Training [57.06824365801612]
We introduce Outlier Robust Adversarial Training (ORAT) in this work.
ORAT is based on a bi-level optimization formulation of adversarial training with a robust rank-based loss function.
We show that the learning objective of ORAT satisfies the $\mathcal{H}$-consistency in binary classification, which establishes it as a proper surrogate to adversarial 0/1 loss.
arXiv Detail & Related papers (2023-09-10T21:36:38Z) - Defense of Adversarial Ranking Attack in Text Retrieval: Benchmark and Baseline via Detection [12.244543468021938]
This paper introduces two types of detection tasks for adversarial documents.
A benchmark dataset is established to facilitate the investigation of adversarial ranking defense.
A comprehensive investigation of the performance of several detection baselines is conducted.
arXiv Detail & Related papers (2023-07-31T16:31:24Z) - Incorporating Relevance Feedback for Information-Seeking Retrieval using Few-Shot Document Re-Ranking [56.80065604034095]
We introduce a kNN approach that re-ranks documents based on their similarity with the query and the documents the user considers relevant.
To evaluate our different integration strategies, we transform four existing information retrieval datasets into the relevance feedback scenario.
arXiv Detail & Related papers (2022-10-19T16:19:37Z) - A Unified Evaluation of Textual Backdoor Learning: Frameworks and Benchmarks [72.7373468905418]
We develop an open-source toolkit OpenBackdoor to foster the implementations and evaluations of textual backdoor learning.
We also propose CUBE, a simple yet strong clustering-based defense baseline.
arXiv Detail & Related papers (2022-06-17T02:29:23Z)