DyKnow-RAG: Dynamic Knowledge Utilization Reinforcement Framework for Noisy Retrieval-Augmented Generation in E-commerce Search Relevance
- URL: http://arxiv.org/abs/2510.11122v1
- Date: Mon, 13 Oct 2025 08:08:59 GMT
- Title: DyKnow-RAG: Dynamic Knowledge Utilization Reinforcement Framework for Noisy Retrieval-Augmented Generation in E-commerce Search Relevance
- Authors: Tingqiao Xu, Shaowei Yao, Chenhe Dong, Yiming Jin, Zerui Huang, Dan Ou, Haihong Tang,
- Abstract summary: DyKnow-RAG is a dynamic noisy-RAG framework built on Group Relative Policy Optimization.<n>It trains two rollout groups (no external context vs a single retrieved chunk) and applies posterior-driven inter-group advantage scaling.<n>It is deployed in Taobao's production relevance system, serving live traffic.
- Score: 7.605150700675235
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Accurately modeling query-item relevance drives e-commerce ranking, yet long-tail, knowledge-heavy, and fast-evolving queries exceed parametric LLM coverage. External context (reviews, attribute encyclopedias, UGC) can help but is noisy, and single-pass latency and cost forbid any clean-then-summarize step. The model must, per query, judge relevance and decide whether to use, partially use, or ignore the context. DyKnow-RAG is a dynamic noisy-RAG framework built on Group Relative Policy Optimization. It trains two rollout groups (no external context vs a single retrieved chunk) and applies posterior-driven inter-group advantage scaling that adaptively reweights their contributions by the per-query correctness gap. This teaches when to trust retrieval versus fall back to parametric knowledge, without process labels, value networks, or extra inference passes, preserving single-pass, single-chunk deployment under production latency. Training combines: (1) supervised initialization with a structured rationale that explicitly records the context-usage decision; (2) an RL pool prioritized by SFT uncertainty to focus where context choice is most consequential; and (3) an optional lightweight DPO warm start to stabilize with-context calibration. Under a unified retrieval/index and fixed latency budget, DyKnow-RAG outperforms SFT, DPO, and vanilla GRPO in offline tests, and delivers consistent lifts on GSB, Query Goodrate, and Item Goodrate in Taobao A/B testing. It is deployed in Taobao's production relevance system, serving live traffic. To our knowledge, it is among the first single-pass RAG solutions for e-commerce relevance, turning noisy external signals into reliable gains without added online complexity.
Related papers
- Adaptive Retrieval helps Reasoning in LLMs -- but mostly if it's not used [19.370220750406755]
Large Language Models (LLMs) often falter in complex reasoning tasks due to their static, parametric knowledge.<n>This work explores a fundamental principle for enhancing generative models: treating retrieval as a form of dynamic in-context learning.
arXiv Detail & Related papers (2026-02-06T21:48:26Z) - TeaRAG: A Token-Efficient Agentic Retrieval-Augmented Generation Framework [62.66056331998838]
TeaRAG is a token-efficient agentic RAG framework capable of compressing both retrieval content and reasoning steps.<n>Our reward function evaluates the knowledge sufficiency by a knowledge matching mechanism, while penalizing excessive reasoning steps.
arXiv Detail & Related papers (2025-11-07T16:08:34Z) - Rethinking On-policy Optimization for Query Augmentation [49.87723664806526]
We present the first systematic comparison of prompting-based and RL-based query augmentation across diverse benchmarks.<n>We introduce a novel hybrid method, On-policy Pseudo-document Query Expansion (OPQE), which learns to generate a pseudo-document that maximizes retrieval performance.
arXiv Detail & Related papers (2025-10-20T04:16:28Z) - Learning to Route: A Rule-Driven Agent Framework for Hybrid-Source Retrieval-Augmented Generation [55.47971671635531]
Large Language Models (LLMs) have shown remarkable performance on general Question Answering (QA)<n>Retrieval-Augmented Generation (RAG) addresses this limitation by enriching LLMs with external knowledge.<n>Existing systems primarily rely on unstructured documents, while largely overlooking relational databases.
arXiv Detail & Related papers (2025-09-30T22:19:44Z) - Fishing for Answers: Exploring One-shot vs. Iterative Retrieval Strategies for Retrieval Augmented Generation [11.180502261031789]
Retrieval-Augmented Generation (RAG) based on Large Language Models (LLMs) is a powerful solution to understand and query the industry's closed-source documents.<n>However, basic RAG often struggles with complex QA tasks in legal and regulatory domains.<n>We explore two strategies to improve evidence coverage and answer quality.
arXiv Detail & Related papers (2025-09-05T05:44:50Z) - RCR-Router: Efficient Role-Aware Context Routing for Multi-Agent LLM Systems with Structured Memory [57.449129198822476]
RCR is a role-aware context routing framework for multi-agent large language model (LLM) systems.<n>It dynamically selects semantically relevant memory subsets for each agent based on its role and task stage.<n>A lightweight scoring policy guides memory selection, and agent outputs are integrated into a shared memory store.
arXiv Detail & Related papers (2025-08-06T21:59:34Z) - PAIRS: Parametric-Verified Adaptive Information Retrieval and Selection for Efficient RAG [14.631028226704883]
We introduce Parametric-verified Adaptive Information Retrieval and Selection (PAIRS)<n>PAIRS integrates parametric and retrieved knowledge to adaptively determine whether to retrieve and how to select external information.<n>We show that PAIRS reduces retrieval costs by around 25% (triggering for only 75% of queries) while still improving accuracy-achieving +1.1% EM and +1.0% F1 over prior baselines.
arXiv Detail & Related papers (2025-08-06T03:33:01Z) - DAMR: Efficient and Adaptive Context-Aware Knowledge Graph Question Answering with LLM-Guided MCTS [28.828541350757714]
This paper proposes Dynamically Adaptive MCTS-based Reasoning (DAMR) for Knowledge Graph Question Answering (KGQA)<n>DAMR integrates Monte Carlo Tree Search (MCTS) with adaptive path evaluation to enable context-aware KGQA.<n>Experiments on multiple KGQA benchmarks show DAMR significantly outperforms SOTA methods.
arXiv Detail & Related papers (2025-08-01T15:38:21Z) - Learning to Route Queries Across Knowledge Bases for Step-wise Retrieval-Augmented Reasoning [60.84901522792042]
Multimodal Retrieval-Augmented Generation (MRAG) has shown promise in mitigating hallucinations in Multimodal Large Language Models (MLLMs)<n>We propose R1, a novel MRAG framework that learns to decide when and where to retrieve knowledge based on the evolving reasoning state.<n>R1- can adaptively and effectively leverage diverse KBs, reducing unnecessary retrievals and improving both efficiency and accuracy.
arXiv Detail & Related papers (2025-05-28T08:17:57Z) - TrustRAG: Enhancing Robustness and Trustworthiness in Retrieval-Augmented Generation [31.231916859341865]
TrustRAG is a framework that systematically filters malicious and irrelevant content before it is retrieved for generation.<n>TrustRAG delivers substantial improvements in retrieval accuracy, efficiency, and attack resistance.
arXiv Detail & Related papers (2025-01-01T15:57:34Z) - Understand What LLM Needs: Dual Preference Alignment for Retrieval-Augmented Generation [64.7982176398485]
Retrieval-augmented generation (RAG) has demonstrated effectiveness in mitigating the hallucination problem of large language models (LLMs)
We propose DPA-RAG, a universal framework designed to align diverse knowledge preferences within RAG systems.
arXiv Detail & Related papers (2024-06-26T18:26:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.