ReFilter: Improving Robustness of Retrieval-Augmented Generation via Gated Filter
- URL: http://arxiv.org/abs/2602.12709v1
- Date: Fri, 13 Feb 2026 08:25:26 GMT
- Title: ReFilter: Improving Robustness of Retrieval-Augmented Generation via Gated Filter
- Authors: Yixin Chen, Ying Xiong, Shangyu Wu, Xiangrui Ke, Nan Guan, Chun Jason Xue,
- Abstract summary: We propose a novel latent-based fusion framework that performs token-level filtering and fusion. ReFilter consists of three key components: a context encoder for encoding context features, a gated filter for weighting each token, and a token fusion module. Our experiments show that ReFilter consistently achieves the best average performance under both in-domain adaptation and out-of-domain transfer.
- Score: 21.74343337071446
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Retrieval-augmented generation (RAG) has become a dominant paradigm for grounding large language models (LLMs) with external evidence in knowledge-intensive question answering. A core design choice is how to fuse retrieved samples into the LLMs, where existing internal fusion approaches broadly fall into query-based fusion, parametric fusion, and latent-based fusion. Despite their effectiveness at modest retrieval scales, these methods often fail to scale gracefully as the number of retrieved candidates k increases: Larger k improves evidence coverage, yet realistic top-k retrieval inevitably contains irrelevant or redundant content and increases the inference cost. To address these limitations, we propose ReFilter, a novel latent-based fusion framework that performs token-level filtering and fusion. ReFilter consists of three key components: a context encoder for encoding context features, a gated filter for weighting each token, and a token fusion module for integrating the weighted token feature into the LLM's hidden states. Our experiments across four general-domain QA benchmarks show that ReFilter consistently achieves the best average performance under both in-domain adaptation and out-of-domain transfer. ReFilter further generalizes to five biomedical QA benchmarks in zero-shot transfer without domain fine-tuning, reaching 70.01% average accuracy with Qwen2.5-14B-Instruct.
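The three-component design described in the abstract (context encoder, gated filter, token fusion) can be illustrated with a minimal sketch. This is not the authors' implementation: the gate is assumed here to be a simple sigmoid over a per-token relevance score, and fusion is assumed to be additive gate-weighted mean pooling into the hidden states, whereas the actual ReFilter modules are learned networks.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gated_fuse(hidden, context, gate_scores, alpha=0.5):
    """Token-level gated filtering and fusion (illustrative sketch).

    hidden:      LLM hidden states, a list of d-dim vectors
    context:     encoded retrieved-token features, a list of d-dim vectors
    gate_scores: one raw relevance score per context token; a sigmoid
                 gate maps each to [0, 1], down-weighting noisy tokens
    alpha:       fusion strength (a hypothetical hyperparameter)
    """
    gates = [sigmoid(s) for s in gate_scores]
    total = sum(gates) or 1.0
    d = len(hidden[0])
    # Gate-weighted mean pool over the retrieved token features.
    pooled = [sum(g * tok[i] for g, tok in zip(gates, context)) / total
              for i in range(d)]
    # Additively fuse the pooled context feature into each hidden state.
    return [[h[i] + alpha * pooled[i] for i in range(d)] for h in hidden]
```

A token with a strongly positive gate score passes through almost at full weight; a strongly negative score effectively filters it out of the fused representation.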
Related papers
- Succeeding at Scale: Automated Multi-Retriever Fusion and Query-Side Adaptation for Multi-Tenant Search [22.080200394842123]
DevRev Search is a passage retrieval benchmark for technical customer support constructed through a fully automatic pipeline. We propose a practical Index-Preserving Adaptation strategy that fine-tunes only the query encoder via Low-Rank Adaptation. Our experiments on DevRev Search and SciFact demonstrate that targeting specific transformer layers in the query encoder yields optimal quality-efficiency trade-offs.
arXiv Detail & Related papers (2026-01-08T06:44:40Z)
- Towards Global Retrieval Augmented Generation: A Benchmark for Corpus-Level Reasoning [50.27838512822097]
We introduce GlobalQA, the first benchmark specifically designed to evaluate global RAG capabilities. We propose GlobalRAG, a multi-tool collaborative framework that preserves structural coherence through chunk-level retrieval. On the Qwen2.5-14B model, GlobalRAG achieves 6.63 F1 compared to the strongest baseline's 1.51 F1.
arXiv Detail & Related papers (2025-10-30T07:29:14Z)
- Careful Queries, Credible Results: Teaching RAG Models Advanced Web Search Tools with Reinforcement Learning [48.46951981642895]
We propose WebFilter, a novel RAG framework that generates source-restricted queries and filters out unreliable content. We show that WebFilter improves answer quality and retrieval precision, outperforming existing RAG methods on both in-domain and out-of-domain benchmarks.
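A source-restricted query of the kind the summary describes can be illustrated with standard search-engine `site:` operators. This is a hypothetical sketch, not WebFilter's actual query generator; the domain allowlist is an assumption.

```python
def restrict_query(query, trusted_domains):
    """Build a source-restricted search query using site: operators.

    trusted_domains is a hypothetical allowlist; a real system would
    also filter the retrieved content for reliability after the search.
    """
    sites = " OR ".join(f"site:{d}" for d in trusted_domains)
    return f"{query} ({sites})"
```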
arXiv Detail & Related papers (2025-08-11T13:08:37Z)
- CostFilter-AD: Enhancing Anomaly Detection through Matching Cost Filtering [64.24198178156627]
Unsupervised anomaly detection (UAD) seeks to localize the anomaly mask of an input image with respect to normal samples. We introduce the concept of cost filtering, borrowed from classical matching tasks, into the UAD problem. We propose a cost volume filtering network, guided by the input observation as an attention query across multiple feature layers.
arXiv Detail & Related papers (2025-05-02T14:52:34Z)
- Beyond Prompting: An Efficient Embedding Framework for Open-Domain Question Answering [22.121850498642008]
Large language models have recently pushed open-domain question answering to new frontiers. Prevailing retriever-reader pipelines, however, often depend on multiple rounds of prompt-level instructions. We propose EmbQA, an embedding-level framework that enhances both the retriever and the reader.
arXiv Detail & Related papers (2025-03-03T14:41:35Z)
- MAIN-RAG: Multi-Agent Filtering Retrieval-Augmented Generation [34.66546005629471]
Large Language Models (LLMs) are essential tools for various natural language processing tasks but often suffer from generating outdated or incorrect information. Retrieval-Augmented Generation (RAG) addresses this issue by incorporating external, real-time information retrieval to ground LLM responses. To tackle this problem, we propose Multi-Agent Filtering Retrieval-Augmented Generation (MAIN-RAG), a training-free RAG framework that leverages multiple LLM agents to collaboratively filter and score retrieved documents.
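The collaborative filter-and-score idea can be sketched as follows. The scoring agents here are stand-in functions rather than LLM agents, and the mean-plus-margin threshold is an assumption, not MAIN-RAG's exact adaptive criterion.

```python
def multi_agent_filter(docs, agent_scorers, margin=0.0):
    """Training-free multi-agent document filtering (sketch).

    agent_scorers: stand-ins for LLM agents; each maps a document to a
    relevance score. A document is kept when its mean agent score
    clears an adaptive threshold (corpus mean + margin) -- an assumed
    criterion for illustration only.
    """
    mean_scores = [sum(score(d) for score in agent_scorers) / len(agent_scorers)
                   for d in docs]
    threshold = sum(mean_scores) / len(mean_scores) + margin
    return [d for d, s in zip(docs, mean_scores) if s >= threshold]
```

Because the threshold adapts to the score distribution of each query's candidate set, no training or fixed cutoff is required.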
arXiv Detail & Related papers (2024-12-31T08:07:26Z)
- ChunkRAG: Novel LLM-Chunk Filtering Method for RAG Systems [2.8692611791027893]
Retrieval-Augmented Generation (RAG) systems generate inaccurate responses due to the retrieval of irrelevant or loosely related information. We propose ChunkRAG, a framework that enhances RAG systems by evaluating and filtering retrieved information at the chunk level.
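Chunk-level filtering can be illustrated with a toy sketch: split each document into fixed-size chunks and drop chunks that fail a relevance check. The term-overlap score below is a stand-in for whatever learned or LLM-based chunk evaluator a system like ChunkRAG actually uses.

```python
def filter_chunks(document, query_terms, chunk_size=20, min_overlap=1):
    """Chunk-level filtering (sketch): split a document into fixed-size
    word chunks and keep only chunks sharing at least min_overlap terms
    with the query. Term overlap is a stand-in relevance score."""
    words = document.split()
    chunks = [" ".join(words[i:i + chunk_size])
              for i in range(0, len(words), chunk_size)]
    q = set(query_terms)
    return [c for c in chunks if len(set(c.split()) & q) >= min_overlap]
```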
arXiv Detail & Related papers (2024-10-25T14:07:53Z)
- Salience DETR: Enhancing Detection Transformer with Hierarchical Salience Filtering Refinement [19.277560848076984]
Two-stage selection strategies result in scale bias and redundancy due to a mismatch between selected queries and objects.
We propose hierarchical salience filtering refinement, which performs transformer encoding only on filtered discriminative queries.
The proposed Salience DETR achieves significant improvements of +4.0% AP, +0.2% AP, +4.4% AP on three challenging task-specific detection datasets.
arXiv Detail & Related papers (2024-03-24T13:01:57Z)
- BlendFilter: Advancing Retrieval-Augmented Large Language Models via Query Generation Blending and Knowledge Filtering [58.403898834018285]
BlendFilter is a novel approach that elevates retrieval-augmented Large Language Models by integrating query generation blending with knowledge filtering.
We conduct extensive experiments on three open-domain question answering benchmarks, and the findings clearly indicate that our innovative BlendFilter surpasses state-of-the-art baselines significantly.
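The blend-then-filter flow can be sketched as: generate query variants, union their retrievals, then filter the merged knowledge. The rewriters, retriever, and reliability predicate below are hypothetical stand-ins, not BlendFilter's components.

```python
def blend_and_filter(question, rewriters, retrieve, is_reliable):
    """Query-generation blending plus knowledge filtering (sketch).

    rewriters:   hypothetical functions producing query variants
    retrieve:    maps a query string to a list of passages
    is_reliable: predicate standing in for the knowledge filter
    """
    queries = [question] + [rewrite(question) for rewrite in rewriters]
    seen, blended = set(), []
    for q in queries:                      # blend retrievals, dedup
        for passage in retrieve(q):
            if passage not in seen:
                seen.add(passage)
                blended.append(passage)
    return [p for p in blended if is_reliable(p)]
```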
arXiv Detail & Related papers (2024-02-16T23:28:02Z)
- KG-FiD: Infusing Knowledge Graph in Fusion-in-Decoder for Open-Domain Question Answering [68.00631278030627]
We propose a novel method KG-FiD, which filters noisy passages by leveraging the structural relationship among the retrieved passages with a knowledge graph.
We show that KG-FiD can improve vanilla FiD by up to 1.5% on answer exact match score and achieve comparable performance with FiD with only 40% of computation cost.
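Filtering noisy passages via structural relationships in a knowledge graph can be sketched as keeping passages whose entities match, or are adjacent to, a question entity. One-hop adjacency is a simplifying assumption here, not KG-FiD's actual re-ranking mechanism.

```python
def kg_filter_passages(passages, entities_of, kg_edges, question_entities):
    """KG-based passage filtering (sketch): keep passages containing an
    entity that matches, or is one KG hop from, a question entity."""
    linked = set(question_entities)
    for a, b in kg_edges:                 # expand by one hop in the KG
        if a in question_entities:
            linked.add(b)
        if b in question_entities:
            linked.add(a)
    return [p for p in passages if entities_of[p] & linked]
```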
arXiv Detail & Related papers (2021-10-08T18:39:59Z)
- Dependency Aware Filter Pruning [74.69495455411987]
Pruning a proportion of unimportant filters is an efficient way to mitigate the inference cost.
Previous work prunes filters according to their weight norms or the corresponding batch-norm scaling factors.
We propose a novel mechanism to dynamically control the sparsity-inducing regularization so as to achieve the desired sparsity.
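The norm-based baseline that this paper departs from (pruning by weight norms, per the summary above) can be sketched as ranking filters by L1 norm and keeping the top fraction. This illustrates the prior-work criterion only, not the proposed dynamic regularization mechanism.

```python
def prune_by_l1_norm(filters, keep_ratio=0.5):
    """Norm-based filter pruning (prior-work baseline): rank conv
    filters by the L1 norm of their flattened weights and return the
    sorted indices of the top keep_ratio fraction."""
    norms = [sum(abs(w) for w in f) for f in filters]
    k = max(1, int(len(filters) * keep_ratio))
    ranked = sorted(range(len(filters)), key=lambda i: norms[i], reverse=True)
    return sorted(ranked[:k])
```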
arXiv Detail & Related papers (2020-05-06T07:41:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.