Related papers: From RAGs to rich parameters: Probing how language models utilize external knowledge over parametric information for factual queries

From RAGs to rich parameters: Probing how language models utilize external knowledge over parametric information for factual queries

URL: http://arxiv.org/abs/2406.12824v1
Date: Tue, 18 Jun 2024 17:46:08 GMT
Title: From RAGs to rich parameters: Probing how language models utilize external knowledge over parametric information for factual queries
Authors: Hitesh Wadhwa, Rahul Seetharaman, Somyaa Aggarwal, Reshmi Ghosh, Samyadeep Basu, Soundararajan Srinivasan, Wenlong Zhao, Shreyas Chaudhari, Ehsan Aghazadeh,
Abstract summary: Retrieval Augmented Generation (RAG) enriches the ability of language models to reason using external context to augment responses for a given user prompt. This approach has risen in popularity due to practical applications in various applications of language models in search, question/answering, and chat-bots. In this paper, we mechanistically examine the RAG pipeline to highlight that language models take shortcut and have a strong bias towards utilizing only the context information to answer the question, while relying minimally on their parametric memory.
Score: 6.382667978271587
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Retrieval Augmented Generation (RAG) enriches the ability of language models to reason using external context to augment responses for a given user prompt. This approach has risen in popularity due to practical applications in various applications of language models in search, question/answering, and chat-bots. However, the exact nature of how this approach works isn't clearly understood. In this paper, we mechanistically examine the RAG pipeline to highlight that language models take shortcut and have a strong bias towards utilizing only the context information to answer the question, while relying minimally on their parametric memory. We probe this mechanistic behavior in language models with: (i) Causal Mediation Analysis to show that the parametric memory is minimally utilized when answering a question and (ii) Attention Contributions and Knockouts to show that the last token residual stream do not get enriched from the subject token in the question, but gets enriched from other informative tokens in the context. We find this pronounced shortcut behaviour true across both LLaMa and Phi family of models.

Related papers

On Mechanistic Circuits for Extractive Question-Answering [47.167393805165325]
Large language models are increasingly used to process documents and facilitate question-answering on them. In our paper, we extract mechanistic circuits for this real-world language modeling task. We show the potential benefits of circuits towards downstream applications such as data attribution to context information.
arXiv Detail & Related papers (2025-02-12T01:54:21Z)
Less is More: Making Smaller Language Models Competent Subgraph Retrievers for Multi-hop KGQA [51.3033125256716]
We model the subgraph retrieval task as a conditional generation task handled by small language models. Our base generative subgraph retrieval model, consisting of only 220M parameters, competitive retrieval performance compared to state-of-the-art models. Our largest 3B model, when plugged with an LLM reader, sets new SOTA end-to-end performance on both the WebQSP and CWQ benchmarks.
arXiv Detail & Related papers (2024-10-08T15:22:36Z)
Deciphering the Interplay of Parametric and Non-parametric Memory in Retrieval-augmented Language Models [5.274653527674298]
Retrieval-Augmented Generation (RAG) models act like retrieving information before generating responses. We use causal mediation analysis and controlled experiments to examine how internal representations influence information processing.
arXiv Detail & Related papers (2024-10-07T16:14:47Z)
Quantifying reliance on external information over parametric knowledge during Retrieval Augmented Generation (RAG) using mechanistic analysis [6.382667978271587]
This paper mechanistically examines the RAG pipeline to highlight that LMs demonstrate a "shortcut'' effect. We find this pronounced "shortcut'' behaviour to be true across both LLMs (e.g.,LlaMa) and SLMs (e.g., Phi)
arXiv Detail & Related papers (2024-10-01T16:48:13Z)
Integrating Large Language Models with Graph-based Reasoning for Conversational Question Answering [58.17090503446995]
We focus on a conversational question answering task which combines the challenges of understanding questions in context and reasoning over evidence gathered from heterogeneous sources like text, knowledge graphs, tables, and infoboxes. Our method utilizes a graph structured representation to aggregate information about a question and its context.
arXiv Detail & Related papers (2024-06-14T13:28:03Z)
Making Retrieval-Augmented Language Models Robust to Irrelevant Context [55.564789967211844]
An important desideratum of RALMs, is that retrieved information helps model performance when it is relevant. Recent work has shown that retrieval augmentation can sometimes have a negative effect on performance.
arXiv Detail & Related papers (2023-10-02T18:52:35Z)
RECKONING: Reasoning through Dynamic Knowledge Encoding [51.076603338764706]
We show that language models can answer questions by reasoning over knowledge provided as part of the context. In these situations, the model fails to distinguish the knowledge that is necessary to answer the question. We propose teaching the model to reason more robustly by folding the provided contextual knowledge into the model's parameters.
arXiv Detail & Related papers (2023-05-10T17:54:51Z)
Knowledge-in-Context: Towards Knowledgeable Semi-Parametric Language Models [58.42146641102329]
We develop a novel semi-parametric language model architecture, Knowledge-in-Context (KiC) KiC empowers a parametric text-to-text language model with a knowledge-rich external memory. As a knowledge-rich semi-parametric language model, KiC only needs a much smaller part to achieve superior zero-shot performance on unseen tasks.
arXiv Detail & Related papers (2022-10-28T23:18:43Z)
Automatic Short Math Answer Grading via In-context Meta-learning [2.0263791972068628]
We study the problem of automatic short answer grading for students' responses to math questions. We use MathBERT, a variant of the popular language model BERT adapted to mathematical content, as our base model. Second, we use an in-context learning approach that provides scoring examples as input to the language model.
arXiv Detail & Related papers (2022-05-30T16:26:02Z)
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks [133.93803565077337]
retrieval-augmented generation models combine pre-trained parametric and non-parametric memory for language generation. We show that RAG models generate more specific, diverse and factual language than a state-of-the-art parametric-only seq2seq baseline.
arXiv Detail & Related papers (2020-05-22T21:34:34Z)
How Much Knowledge Can You Pack Into the Parameters of a Language Model? [44.81324633069311]
It has been observed that neural language models trained on unstructured text can implicitly store and retrieve knowledge using natural language queries. We measure the practical utility of this approach by fine-tuning pre-trained models to answer questions without access to any external context or knowledge.
arXiv Detail & Related papers (2020-02-10T18:55:58Z)

This list is automatically generated from the titles and abstracts of the papers in this site.