Related papers: Unlocking the Potentials of Retrieval-Augmented Generation for Diffusion Language Models

Unlocking the Potentials of Retrieval-Augmented Generation for Diffusion Language Models

URL: http://arxiv.org/abs/2601.11342v1
Date: Fri, 16 Jan 2026 14:45:46 GMT
Title: Unlocking the Potentials of Retrieval-Augmented Generation for Diffusion Language Models
Authors: Chuanyue Yu, Jiahui Wang, Yuhan Li, Heng Chang, Ge Lan, Qingyun Sun, Jia Li, Jianxin Li, Ziwei Zhang,
Abstract summary: Retrieval-Augmented Generation shows great successes for enhancing large language models (LLMs)<n>We show that DLMs coupled with RAG show promising potentials with stronger dependency on contextual information, but suffer from limited generation precision.<n>We propose Semantic-Preserving REtrieval-Augmented Diffusion (SPREAD), a novel framework that introduces a query-relevance-guided denoising strategy.
Score: 38.148737920360766
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Diffusion Language Models (DLMs) have recently demonstrated remarkable capabilities in natural language processing tasks. However, the potential of Retrieval-Augmented Generation (RAG), which shows great successes for enhancing large language models (LLMs), has not been well explored, due to the fundamental difference between LLM and DLM decoding. To fill this critical gap, we systematically test the performance of DLMs within the RAG framework. Our findings reveal that DLMs coupled with RAG show promising potentials with stronger dependency on contextual information, but suffer from limited generation precision. We identify a key underlying issue: Response Semantic Drift (RSD), where the generated answer progressively deviates from the query's original semantics, leading to low precision content. We trace this problem to the denoising strategies in DLMs, which fail to maintain semantic alignment with the query throughout the iterative denoising process. To address this, we propose Semantic-Preserving REtrieval-Augmented Diffusion (SPREAD), a novel framework that introduces a query-relevance-guided denoising strategy. By actively guiding the denoising trajectory, SPREAD ensures the generation remains anchored to the query's semantics and effectively suppresses drift. Experimental results demonstrate that SPREAD significantly enhances the precision and effectively mitigates RSD of generated answers within the RAG framework.

Related papers

Reproducing and Dissecting Denoising Language Models for Speech Recognition [31.91567892562116]
Denoising language models (DLMs) have been proposed as a powerful alternative to traditional language models (LMs) for automatic speech recognition (ASR)<n>This paper presents the first independent, large-scale empirical study of DLMs.
arXiv Detail & Related papers (2025-12-15T17:33:22Z)
DRES: Benchmarking LLMs for Disfluency Removal [27.083825614818135]
Disfluencies, such as "um," "uh," interjections, parentheticals, and edited statements, remain a persistent challenge for speech-driven systems.<n>Disfluency Removal Evaluation Suite, a controlled text-level benchmark, establishes a reproducible semantic upper bound for this task.<n>DRES builds on human-annotated Switchboard transcripts, isolating disfluency removal from ASR errors and acoustic variability.
arXiv Detail & Related papers (2025-09-24T17:08:12Z)
RADIANT: Retrieval AugmenteD entIty-context AligNmenT -- Introducing RAG-ability and Entity-Context Divergence [18.268335797537983]
Retrieval-Augmented Generation (RAG) is a technique to enhance factual accuracy by integrating external knowledge into the generation process.<n>This paper introduces Radiant, a framework that merges RAG with alignment designed to optimize the interplay between retrieved evidence and generated content.
arXiv Detail & Related papers (2025-06-28T21:40:35Z)
SUTA-LM: Bridging Test-Time Adaptation and Language Model Rescoring for Robust ASR [58.31068047426522]
Test-Time Adaptation (TTA) aims to mitigate by adjusting models during inference.<n>Recent work explores combining TTA with external language models, using techniques like beam search rescoring or generative error correction.<n>We propose SUTA-LM, a simple yet effective extension of SUTA, with language model rescoring.<n> Experiments on 18 diverse ASR datasets show that SUTA-LM achieves robust results across a wide range of domains.
arXiv Detail & Related papers (2025-06-10T02:50:20Z)
Attributing Response to Context: A Jensen-Shannon Divergence Driven Mechanistic Study of Context Attribution in Retrieval-Augmented Generation [52.3707788779464]
We introduce a novel Jensen-Shannon Divergence driven method to Attribute Response to Context (ARC-JSD)<n>ARC-JSD enables efficient and accurate identification of essential context sentences without additional fine-tuning, gradient-calculation or surrogate modelling.<n> Evaluations on a wide range of RAG benchmarks, such as TyDi QA, Hotpot QA, and Musique, using instruction-tuned LLMs in different scales demonstrate superior accuracy and significant computational efficiency improvements.
arXiv Detail & Related papers (2025-05-22T09:04:03Z)
InstructRAG: Instructing Retrieval-Augmented Generation via Self-Synthesized Rationales [14.655518998487237]
We propose InstructRAG, where LMs explicitly learn the denoising process through self-synthesized rationales.<n>InstructRAG requires no additional supervision, allows for easier verification of the predicted answers.<n>Experiments show InstructRAG consistently outperforms existing RAG methods in both training-free and trainable scenarios.
arXiv Detail & Related papers (2024-06-19T15:25:29Z)
Prompt Perturbation in Retrieval-Augmented Generation based Large Language Models [9.688626139309013]
Retrieval-Augmented Generation is considered as a means to improve the trustworthiness of text generation from large language models. In this work, we find that the insertion of even a short prefix to the prompt leads to the generation of outputs far away from factually correct answers. We introduce a novel optimization technique called Gradient Guided Prompt Perturbation.
arXiv Detail & Related papers (2024-02-11T12:25:41Z)
Large Language Models are Efficient Learners of Noise-Robust Speech Recognition [65.95847272465124]
Recent advances in large language models (LLMs) have promoted generative error correction (GER) for automatic speech recognition (ASR) In this work, we extend the benchmark to noisy conditions and investigate if we can teach LLMs to perform denoising for GER. Experiments on various latest LLMs demonstrate our approach achieves a new breakthrough with up to 53.9% correction improvement in terms of word error rate.
arXiv Detail & Related papers (2024-01-19T01:29:27Z)
Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection [74.51523859064802]
We introduce a new framework called Self-Reflective Retrieval-Augmented Generation (Self-RAG) Self-RAG enhances an LM's quality and factuality through retrieval and self-reflection. It significantly outperforms state-of-the-art LLMs and retrieval-augmented models on a diverse set of tasks.
arXiv Detail & Related papers (2023-10-17T18:18:32Z)
Benchmarking Large Language Models in Retrieval-Augmented Generation [53.504471079548]
We systematically investigate the impact of Retrieval-Augmented Generation on large language models. We analyze the performance of different large language models in 4 fundamental abilities required for RAG. We establish Retrieval-Augmented Generation Benchmark (RGB), a new corpus for RAG evaluation in both English and Chinese.
arXiv Detail & Related papers (2023-09-04T08:28:44Z)

This list is automatically generated from the titles and abstracts of the papers in this site.