Learning to Extract Rational Evidence via Reinforcement Learning for Retrieval-Augmented Generation
- URL: http://arxiv.org/abs/2507.15586v4
- Date: Wed, 30 Jul 2025 11:51:25 GMT
- Title: Learning to Extract Rational Evidence via Reinforcement Learning for Retrieval-Augmented Generation
- Authors: Xinping Zhao, Shouzheng Huang, Yan Zhong, Xinshuo Hu, Meishan Zhang, Baotian Hu, Min Zhang
- Abstract summary: Retrieval-Augmented Generation (RAG) effectively improves the accuracy of Large Language Models (LLMs). Previous methods extract evidence straightforwardly, without explicit reasoning, which risks filtering out key clues and struggles with generalization. We propose EviOmni, which learns to extract rational evidence by (1) explicitly reasoning to identify potential cues within the retrieved content, and then (2) consciously extracting so as not to omit any key cues helpful for answering the question.
- Score: 37.47571308389908
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Retrieval-Augmented Generation (RAG) effectively improves the accuracy of Large Language Models (LLMs). However, retrieval noise significantly degrades the quality of LLMs' generation, necessitating denoising mechanisms. Previous methods extract evidence straightforwardly, without explicit reasoning, which risks filtering out key clues and struggles with generalization. To address this, we propose EviOmni, which learns to extract rational evidence by (1) explicitly reasoning to identify potential cues within the retrieved content, and then (2) consciously extracting so as not to omit any key cues helpful for answering the question. Specifically, we frame evidence reasoning and evidence extraction as one unified response for end-to-end training; apply knowledge token masks for disentanglement, deriving reasoning-based and extraction-based answers; and devise three types of verifiable reward functions (answer, length, and format) to update the model via a policy optimization algorithm. Extensive experiments on three benchmark datasets show that EviOmni provides compact, high-quality evidence, improves the accuracy of downstream tasks, and supports effective deployment in online RAG systems.
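The abstract names three verifiable reward signals (answer, length, and format) combined under a policy optimization algorithm. A minimal sketch of how such a composite reward could be computed follows; the response template, matching rules, and weights are assumptions for illustration, not EviOmni's actual implementation.

```python
import re

def format_reward(response: str) -> float:
    """1.0 if the response follows the assumed <think>...</think><evidence>...</evidence> template."""
    pattern = r"(?s)\s*<think>.*</think>\s*<evidence>.*</evidence>\s*"
    return 1.0 if re.fullmatch(pattern, response) else 0.0

def length_reward(evidence: str, context: str, target_ratio: float = 0.2) -> float:
    """Reward compact evidence: full credit at or below an assumed compression
    ratio, linearly decaying above it."""
    ratio = len(evidence.split()) / max(len(context.split()), 1)
    return 1.0 if ratio <= target_ratio else max(0.0, 1.0 - (ratio - target_ratio))

def answer_reward(predicted: str, gold: str) -> float:
    """Exact-match answer reward, the usual verifiable signal in RLVR-style training."""
    norm = lambda s: re.sub(r"\W+", " ", s.lower()).strip()
    return 1.0 if norm(predicted) == norm(gold) else 0.0

def total_reward(response: str, predicted: str, gold: str, context: str,
                 weights=(1.0, 0.5, 0.5)) -> float:
    """Weighted sum of the three verifiable rewards (weights are assumptions)."""
    match = re.search(r"(?s)<evidence>(.*)</evidence>", response)
    evidence = match.group(1) if match else ""
    return (weights[0] * answer_reward(predicted, gold)
            + weights[1] * length_reward(evidence, context)
            + weights[2] * format_reward(response))
```

Because all three signals are verifiable, a GRPO/PPO-style trainer only needs this scalar per sampled response; no learned reward model is required.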
Related papers
- ClueAnchor: Clue-Anchored Knowledge Reasoning Exploration and Optimization for Retrieval-Augmented Generation [82.28147821286709]
We propose ClueAnchor, a novel framework for enhancing Retrieval-Augmented Generation (RAG). ClueAnchor extracts key clues from retrieved content and generates multiple reasoning paths based on different knowledge configurations. Experiments show that ClueAnchor significantly outperforms prior RAG baselines in reasoning completeness and robustness.
arXiv Detail & Related papers (2025-05-30T09:18:08Z)
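As a rough illustration of the multi-path idea summarized above, the hypothetical sketch below generates one candidate answer per knowledge configuration and keeps a best/worst pair for preference optimization; the configuration names and helpers are assumptions, not ClueAnchor's actual code.

```python
def reasoning_paths(generate, question: str, retrieved: str, clues: str) -> dict:
    """Build one candidate answer per knowledge configuration; `generate` is any
    prompt -> text LLM call, and the configuration names are assumptions."""
    configs = {
        "internal_only": question,  # parametric knowledge alone
        "retrieval_augmented": f"{retrieved}\n\nQuestion: {question}",
        "clue_anchored": f"Key clues: {clues}\n\n{retrieved}\n\nQuestion: {question}",
    }
    return {name: generate(prompt) for name, prompt in configs.items()}

def pick_preference_pair(paths: dict, score) -> tuple:
    """Rank paths with a reward/score function (e.g., answer match against a
    reference) and return the (best, worst) pair for preference-style training."""
    ranked = sorted(paths.items(), key=lambda kv: score(kv[1]), reverse=True)
    return ranked[0], ranked[-1]
```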
- Divide-Then-Align: Honest Alignment based on the Knowledge Boundary of RAG [51.120170062795566]
We propose Divide-Then-Align (DTA) to endow RAG systems with the ability to respond with "I don't know" when a query falls outside their knowledge boundary. DTA balances accuracy with appropriate abstention, enhancing the reliability and trustworthiness of retrieval-augmented systems.
arXiv Detail & Related papers (2025-05-27T08:21:21Z)
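One plausible reading of the "divide" step is to partition training queries by whether the model answers correctly with and without retrieval, aligning it to abstain only where both fail. A hypothetical sketch, with all helpers injected:

```python
def divide_by_knowledge_boundary(samples, answer_closed, answer_rag, is_correct):
    """Partition samples into four quadrants keyed by (parametric correct,
    retrieval-augmented correct); only the (False, False) quadrant gets the
    abstention target. All helpers are hypothetical injected callables."""
    quadrants = {(p, r): [] for p in (True, False) for r in (True, False)}
    for s in samples:
        p_ok = is_correct(answer_closed(s["question"]), s["gold"])
        r_ok = is_correct(answer_rag(s["question"], s["docs"]), s["gold"])
        target = s["gold"] if (p_ok or r_ok) else "I don't know"
        quadrants[(p_ok, r_ok)].append({**s, "target": target})
    return quadrants
```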
- Silent Leaks: Implicit Knowledge Extraction Attack on RAG Systems through Benign Queries [27.665853244467463]
We introduce the Implicit Knowledge Extraction Attack (IKEA), which conducts knowledge extraction on RAG systems through benign queries. IKEA first leverages anchor concepts to generate natural-looking queries, and then designs two mechanisms that lead the anchor concepts to thoroughly explore the RAG system's private knowledge. Experiments demonstrate IKEA's effectiveness under various defenses, surpassing baselines by over 80% in extraction efficiency and 90% in attack success rate.
arXiv Detail & Related papers (2025-05-21T12:04:42Z)
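A hedged sketch of what an anchor-concept probing loop might look like: benign queries are built around anchor concepts, and the anchor set expands from whatever each response reveals. The query template and expansion logic are assumptions, not the paper's two mechanisms.

```python
import random

def extraction_loop(seed_concepts, ask_rag, extract_concepts, rounds: int = 10):
    """Probe a RAG system with benign, anchor-concept-centred questions and
    expand the anchor set from each response (hypothetical helpers injected)."""
    anchors, harvested = list(seed_concepts), []
    for _ in range(rounds):
        concept = random.choice(anchors)
        # Benign surface form: an ordinary explanatory question, no jailbreak prompt.
        query = f"Could you explain {concept} and anything closely related to it?"
        response = ask_rag(query)
        harvested.append((query, response))
        for new in extract_concepts(response):  # grow the anchor set
            if new not in anchors:
                anchors.append(new)
    return harvested
```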
- Trust, But Verify: A Self-Verification Approach to Reinforcement Learning with Verifiable Rewards [67.86091419220816]
Large Language Models (LLMs) show great promise in complex reasoning. A prevalent issue is "superficial self-reflection", where models fail to robustly verify their own outputs. We introduce RISE (Reinforcing Reasoning with Self-Verification), a novel online RL framework designed to tackle this.
arXiv Detail & Related papers (2025-05-19T17:59:31Z)
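A minimal sketch of a self-verification reward in the spirit of RISE: the policy is rewarded both for solving the problem and for correctly judging its own solution, so blind self-approval earns nothing. The weights and the string-based verdict parsing are assumptions; the online RL loop is omitted.

```python
def rise_style_reward(solution: str, verification: str, gold: str, judge_correct) -> float:
    """Reward the solution plus the model's own verification of it: the verifier
    earns credit for agreeing with ground truth about correctness, not for
    approving unconditionally. `judge_correct` is a hypothetical checker."""
    solution_ok = judge_correct(solution, gold)
    verifier_says_ok = "correct" in verification.lower()
    verification_ok = (verifier_says_ok == solution_ok)
    return 1.0 * solution_ok + 0.5 * verification_ok
```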
- Self-Routing RAG: Binding Selective Retrieval with Knowledge Verbalization [97.72503890388866]
We propose Self-Routing RAG (SR-RAG), a novel framework that binds selective retrieval with knowledge verbalization. SR-RAG enables an LLM to dynamically decide between external retrieval and verbalizing its own parametric knowledge. We introduce dynamic knowledge source inference via nearest neighbor search to improve the accuracy of knowledge source decisions.
arXiv Detail & Related papers (2025-04-01T17:59:30Z)
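The nearest-neighbor source inference mentioned above could look like the following generic kNN vote over previously labeled queries; the embeddings, labels, and k are assumptions for illustration rather than SR-RAG's actual procedure.

```python
import numpy as np

def decide_knowledge_source(query_vec, bank_vecs, bank_labels, k: int = 5) -> str:
    """Majority vote over the k nearest previously labeled queries, each tagged
    'retrieve' or 'verbalize' (a generic kNN sketch of source inference)."""
    sims = bank_vecs @ query_vec / (
        np.linalg.norm(bank_vecs, axis=1) * np.linalg.norm(query_vec) + 1e-9)
    top_k = np.argsort(-sims)[:k]
    votes = [bank_labels[i] for i in top_k]
    return max(set(votes), key=votes.count)
```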
- DeepRAG: Thinking to Retrieve Step by Step for Large Language Models [92.87532210660456]
We propose DeepRAG, a framework that models retrieval-augmented reasoning as a Markov Decision Process (MDP). By iteratively decomposing queries, DeepRAG dynamically determines whether to retrieve external knowledge or rely on parametric reasoning at each step. Experiments show that DeepRAG improves retrieval efficiency and boosts answer accuracy by 26.4%, demonstrating its effectiveness in enhancing retrieval-augmented reasoning.
arXiv Detail & Related papers (2025-02-03T08:22:45Z)
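A schematic of the retrieve-or-reason loop the MDP framing implies, with every decision and generation function injected as a hypothetical placeholder:

```python
def deeprag_style_loop(question, decompose, should_retrieve, retrieve,
                       answer_with, max_steps: int = 5):
    """Iteratively decompose the question; at each step choose between external
    retrieval and parametric reasoning. All helpers are hypothetical callables."""
    notes = []
    remaining = question
    for _ in range(max_steps):
        subquery, remaining = decompose(remaining, notes)  # next atomic subquery
        if subquery is None:                               # terminal MDP state
            break
        evidence = retrieve(subquery) if should_retrieve(subquery, notes) else None
        notes.append((subquery, answer_with(subquery, evidence)))
    return answer_with(question, notes)                    # final answer
```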
- SEER: Self-Aligned Evidence Extraction for Retrieval-Augmented Generation [21.823931225182115]
We propose a model-based evidence extraction learning framework, SEER, to optimize a vanilla model as an evidence extractor. Our method largely improves final RAG performance, enhances the faithfulness, helpfulness, and conciseness of the extracted evidence, and reduces evidence length by a factor of 9.25.
arXiv Detail & Related papers (2024-10-15T06:26:24Z)
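A hypothetical sketch of selecting self-generated evidence by the three properties the summary names; the concrete criteria and the conciseness threshold are assumptions, not SEER's implementation.

```python
def select_aligned_evidence(candidates, context, question, gold,
                            faithful, helpful, max_ratio: float = 0.25):
    """Keep self-generated evidence that is faithful to the context, helpful for
    answering the question, and concise (helpers and threshold are assumed)."""
    kept = []
    for evidence in candidates:
        concise = len(evidence.split()) <= max_ratio * len(context.split())
        if concise and faithful(evidence, context) and helpful(evidence, question, gold):
            kept.append(evidence)
    return kept
```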
- InstructRAG: Instructing Retrieval-Augmented Generation via Self-Synthesized Rationales [14.655518998487237]
We propose InstructRAG, where LMs explicitly learn the denoising process through self-synthesized rationales. InstructRAG requires no additional supervision and allows for easier verification of the predicted answers. Experiments show InstructRAG consistently outperforms existing RAG methods in both training-free and trainable scenarios.
arXiv Detail & Related papers (2024-06-19T15:25:29Z)
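The self-synthesized rationales could be produced along these lines: given the known answer, the LM explains which retrieved documents are noise and which support the answer, and that explanation becomes denoising supervision. The prompt wording below is an assumption, not the paper's template.

```python
def synthesize_rationale(llm, question, docs, gold_answer):
    """Ask the LM to explain how the (possibly noisy) retrieved documents lead
    to the known answer; the rationale then serves as denoising supervision."""
    prompt = (
        "Retrieved documents:\n" + "\n---\n".join(docs)
        + f"\n\nQuestion: {question}\nKnown answer: {gold_answer}\n\n"
        "Explain step by step which documents are noise, which are relevant, "
        "and how the relevant ones support the answer."
    )
    return llm(prompt)  # the rationale used as a training target
```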
- Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection [74.51523859064802]
We introduce a new framework called Self-Reflective Retrieval-Augmented Generation (Self-RAG). Self-RAG enhances an LM's quality and factuality through retrieval and self-reflection. It significantly outperforms state-of-the-art LLMs and retrieval-augmented models on a diverse set of tasks.
arXiv Detail & Related papers (2023-10-17T18:18:32Z)
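A toy sketch of reflection-guided generation in the spirit of Self-RAG: a retrieval decision, a draft, and a self-critique that can trigger revision. Self-RAG's actual special reflection tokens and training procedure are more involved; the token names and prompts here are assumptions.

```python
def self_rag_style_generate(lm, retriever, question):
    """Reflection-guided generation: decide whether to retrieve, draft an answer,
    then self-critique and optionally revise (token names are assumptions)."""
    decision = lm(f"{question}\nDecide: [Retrieve] or [No Retrieve]")
    if "[Retrieve]" in decision:
        docs = retriever(question)
        draft = lm(f"{docs}\n\nQuestion: {question}")
    else:
        draft = lm(question)
    critique = lm(f"Question: {question}\nAnswer: {draft}\n"
                  "Critique: is this answer supported and useful?")
    if "supported" in critique.lower():
        return draft
    return lm(f"{question}\nRevise this answer: {draft}\nCritique: {critique}")
```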