The Role of Parametric Injection: A Systematic Study of Parametric Retrieval-Augmented Generation
- URL: http://arxiv.org/abs/2510.12668v1
- Date: Tue, 14 Oct 2025 16:05:01 GMT
- Title: The Role of Parametric Injection: A Systematic Study of Parametric Retrieval-Augmented Generation
- Authors: Minghao Tang, Shiyu Ni, Jingtong Wu, Zengxin Han, Keping Bi,
- Abstract summary: Parametric retrieval-augmented generation (PRAG) encodes documents as model parameters and injects these representations into the model during inference. We show that PRAG captures only partial semantic information of documents, and relying on it alone yields inferior performance compared to interaction at the text level. When parameterized documents are combined with textual documents, the model can leverage relevant information more effectively and becomes more robust to noisy inputs.
- Score: 8.544971676258971
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Retrieval-augmented generation (RAG) enhances large language models (LLMs) by retrieving external documents. As an emerging form of RAG, parametric retrieval-augmented generation (PRAG) encodes documents as model parameters (i.e., LoRA modules) and injects these representations into the model during inference, enabling interaction between the LLM and documents at the parametric level. Compared with directly placing documents in the input context, PRAG is more efficient and has the potential to offer deeper model-document interaction. Despite its growing attention, the mechanism underlying parametric injection remains poorly understood. In this work, we present a systematic study of PRAG to clarify the role of parametric injection, showing that parameterized documents capture only partial semantic information of documents, and relying on them alone yields inferior performance compared to interaction at the text level. However, these parametric representations encode high-level document information that can enhance the model's understanding of documents within the input context. When parameterized documents are combined with textual documents, the model can leverage relevant information more effectively and becomes more robust to noisy inputs, achieving better performance than either source alone. We recommend jointly using parameterized and textual documents and advocate for increasing the information content of parametric representations to advance PRAG.
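The core mechanism the abstract describes can be sketched in a few lines. This is an illustrative toy, not the authors' implementation: the paper trains per-document LoRA modules offline, whereas here a "document" is simply represented by hand-made low-rank factors `A` and `B` whose product is merged into a frozen weight matrix at inference time, so the document lives in the parameters rather than in the prompt.

```python
# Toy sketch of PRAG-style parametric injection (illustrative only).
# A document is encoded offline as low-rank factors (A, B); at inference
# the delta B @ A is merged into a frozen weight matrix W, so the model
# carries the document in its parameters rather than in the input context.

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def inject_document(W, A, B, alpha=1.0):
    """Merge a per-document LoRA delta into the frozen weight W.

    W: d_out x d_in frozen weight; B: d_out x r and A: r x d_in are
    low-rank factors encoding one document; alpha scales the update.
    """
    delta = matmul(B, A)  # d_out x d_in update carrying document info
    return [[w + alpha * d for w, d in zip(wrow, drow)]
            for wrow, drow in zip(W, delta)]

# Frozen 2x2 weight and a rank-1 "document" representation.
W = [[1.0, 0.0],
     [0.0, 1.0]]
B = [[1.0], [2.0]]   # d_out x r, with r = 1
A = [[0.5, 0.5]]     # r x d_in

W_doc = inject_document(W, A, B)
# W_doc = W + B @ A = [[1.5, 0.5], [1.0, 2.0]]
```

The paper's recommendation of combining parametric and textual documents would correspond, in this sketch, to running the model with `W_doc` while also placing the document text in the prompt.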
Related papers
- DiffuGR: Generative Document Retrieval with Diffusion Language Models [80.78126312115087]
We propose generative document retrieval with diffusion language models, dubbed DiffuGR. For inference, DiffuGR attempts to generate DocID tokens in parallel and refine them through a controllable number of denoising steps. In contrast to conventional left-to-right auto-regressive decoding, DiffuGR provides a novel mechanism to first generate more confident DocID tokens.
arXiv Detail & Related papers (2025-11-11T12:00:09Z)
- Model-Document Protocol for AI Search [11.377241012645994]
We introduce the Model-Document Protocol (MDP), a general framework that formalizes how raw text is bridged to large language models (LLMs). Rather than treating retrieval as passage fetching, MDP defines multiple pathways that transform unstructured documents into task-specific, LLM-ready inputs. As an instantiation, we present MDP-Agent, which realizes the protocol through an agentic process.
arXiv Detail & Related papers (2025-10-29T04:29:17Z)
- Scaling Beyond Context: A Survey of Multimodal Retrieval-Augmented Generation for Document Understanding [61.36285696607487]
Document understanding is critical for applications from financial analysis to scientific discovery. Current approaches, whether OCR-based pipelines feeding Large Language Models (LLMs) or native Multimodal LLMs (MLLMs), face key limitations. Retrieval-Augmented Generation (RAG) helps ground models in external data, but documents' multimodal nature, combining text, tables, charts, and layout, demands a more advanced paradigm: Multimodal RAG.
arXiv Detail & Related papers (2025-10-17T02:33:16Z)
- ABCD-LINK: Annotation Bootstrapping for Cross-Document Fine-Grained Links [57.514511353084565]
We introduce a new domain-agnostic framework for selecting a best-performing approach and annotating cross-document links. We apply our framework in two distinct domains -- peer review and news. The resulting novel datasets lay the foundation for numerous cross-document tasks like media framing and peer review.
arXiv Detail & Related papers (2025-09-01T11:32:24Z)
- Privacy-Preserving Reasoning with Knowledge-Distilled Parametric Retrieval Augmented Generation [37.36013238444145]
Parametric RAG (PRAG) addresses this by encoding documents as LoRA modules within LLMs, enabling reasoning without exposing raw content. We propose DistilledPRAG, a knowledge-distilled parametric RAG model aligned with standard RAG in document structure and parameter activation. Experiments on four QA datasets show that DistilledPRAG outperforms baselines in accuracy and generalizes well on out-of-distribution (OOD) data.
arXiv Detail & Related papers (2025-09-01T03:23:57Z)
- Rational Retrieval Acts: Leveraging Pragmatic Reasoning to Improve Sparse Retrieval [29.652506774818267]
Current sparse neural information retrieval methods do not take into account the document collection and the complex interplay between different term weights when representing a single document. We show how Rational Speech Acts (RSA), a linguistics framework used to minimize the number of features to be communicated when identifying an object in a set, can be adapted to the IR case. Experiments show that incorporating RSA consistently improves multiple sparse retrieval models, achieving state-of-the-art performance on out-of-domain datasets.
arXiv Detail & Related papers (2025-05-06T16:21:10Z)
- Cognitive-Aligned Document Selection for Retrieval-augmented Generation [2.9060210098040855]
We propose GGatrieval to dynamically update queries and filter high-quality, reliable retrieval documents. We parse the user query into its syntactic components and perform fine-grained grounded alignment with the retrieved documents. Our approach introduces a novel criterion for filtering retrieved documents, closely emulating human strategies for acquiring targeted information.
arXiv Detail & Related papers (2025-02-17T13:00:15Z)
- DOGR: Leveraging Document-Oriented Contrastive Learning in Generative Retrieval [10.770281363775148]
We propose a novel and general generative retrieval framework, namely Leveraging Document-Oriented Contrastive Learning in Generative Retrieval (DOGR). It adopts a two-stage learning strategy that captures the relationship between queries and documents comprehensively through direct interactions. Negative sampling methods and corresponding contrastive learning objectives are implemented to enhance the learning of semantic representations.
arXiv Detail & Related papers (2025-02-11T03:25:42Z)
- Parametric Retrieval Augmented Generation [32.29608109539912]
Parametric RAG is a new RAG paradigm that integrates external knowledge directly into the parameters of feed-forward networks. It substantially enhances both the effectiveness and efficiency of knowledge augmentation in large language models.
arXiv Detail & Related papers (2025-01-27T10:04:49Z)
- Less is More: Making Smaller Language Models Competent Subgraph Retrievers for Multi-hop KGQA [51.3033125256716]
We model the subgraph retrieval task as a conditional generation task handled by small language models.
Our base generative subgraph retrieval model, consisting of only 220M parameters, achieves competitive retrieval performance compared to state-of-the-art models.
Our largest 3B model, when plugged with an LLM reader, sets new SOTA end-to-end performance on both the WebQSP and CWQ benchmarks.
arXiv Detail & Related papers (2024-10-08T15:22:36Z)
- Efficient Document Ranking with Learnable Late Interactions [73.41976017860006]
Cross-Encoder (CE) and Dual-Encoder (DE) models are two fundamental approaches for query-document relevance in information retrieval.
To predict relevance, CE models use joint query-document embeddings, while DE models maintain factorized query and document embeddings.
Recently, late-interaction models have been proposed to realize more favorable latency-quality tradeoffs, by using a DE structure followed by a lightweight scorer.
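The late-interaction idea sketched in the summary above can be illustrated with a minimal MaxSim-style scorer. This is a hedged toy, not the paper's learnable scorer: token embeddings here are hand-made stand-ins for encoder outputs, and the scorer is the fixed sum-of-maxima commonly used in late-interaction retrieval.

```python
# Minimal sketch of late-interaction scoring: each query token keeps its
# own embedding (as in a Dual-Encoder), and a lightweight scorer combines
# per-token similarities instead of a single joint embedding.

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def maxsim_score(query_vecs, doc_vecs):
    """Sum over query tokens of the max similarity to any document token."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)

query = [[1.0, 0.0], [0.0, 1.0]]             # two query-token embeddings
doc = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]   # three document-token embeddings

score = maxsim_score(query, doc)
# per-token maxima: max(0.9, 0.2, 0.5) = 0.9 and max(0.1, 0.8, 0.5) = 0.8
```

Because the document-token embeddings are query-independent, they can be precomputed offline, which is the source of the latency-quality tradeoff the summary mentions.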
arXiv Detail & Related papers (2024-06-25T22:50:48Z)
- Continual Learning for Generative Retrieval over Dynamic Corpora [115.79012933205756]
Generative retrieval (GR) directly predicts the identifiers of relevant documents (i.e., docids) based on a parametric model. The ability to incrementally index new documents while preserving the ability to answer queries is vital to applying GR models. We put forward a novel Continual-LEarner for generatiVE Retrieval (CLEVER) model and make two major contributions to continual learning for GR.
arXiv Detail & Related papers (2023-08-29T01:46:06Z)
- UnifieR: A Unified Retriever for Large-Scale Retrieval [84.61239936314597]
Large-scale retrieval is to recall relevant documents from a huge collection given a query.
Recent retrieval methods based on pre-trained language models (PLM) can be coarsely categorized into either dense-vector or lexicon-based paradigms.
We propose a new learning framework, UnifieR, which unifies dense-vector and lexicon-based retrieval in one model with a dual-representing capability.
arXiv Detail & Related papers (2022-05-23T11:01:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.