Related papers: Understanding Retrieval Augmentation for Long-Form Question Answering

URL: http://arxiv.org/abs/2310.12150v1
Date: Wed, 18 Oct 2023 17:59:10 GMT
Title: Understanding Retrieval Augmentation for Long-Form Question Answering
Authors: Hung-Ting Chen, Fangyuan Xu, Shane A. Arora, Eunsol Choi
Abstract summary: We present a study of retrieval-augmented language models (LMs) on long-form question answering. We analyze how retrieval augmentation impacts different LMs, by comparing answers generated from models while using the same evidence documents.
Score: 44.19142029392175
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We present a study of retrieval-augmented language models (LMs) on long-form question answering. We analyze how retrieval augmentation impacts different LMs, by comparing answers generated from models while using the same evidence documents, and how the differing quality of the retrieval document set impacts the answers generated from the same LM. We study various attributes of generated answers (e.g., fluency, length, variance) with an emphasis on the attribution of generated long-form answers to in-context evidence documents. We collect human annotations of answer attribution and evaluate methods for automatically judging attribution. Our study provides new insights on how retrieval augmentation impacts long, knowledge-rich text generation of LMs. We further identify attribution patterns for long text generation and analyze the main culprits of attribution errors. Together, our analysis reveals how retrieval augmentation impacts long, knowledge-rich text generation and provides directions for future work.

Related papers

Retrieval-Augmented Visual Question Answering via Built-in Autoregressive Search Engines [17.803396998387665]
Retrieval-augmented generation (RAG) has emerged to address the knowledge-intensive visual question answering (VQA) task. We propose ReAuSE, an alternative to the previous RAG model for the knowledge-based VQA task. Our model functions both as a generative retriever and an accurate answer generator.
arXiv Detail & Related papers (2025-02-23T16:39:39Z)
Pointwise Mutual Information as a Performance Gauge for Retrieval-Augmented Generation [78.28197013467157]
We show that the pointwise mutual information between a context and a question is an effective gauge for language model performance. We propose two methods that use the pointwise mutual information between a document and a question as a gauge for selecting and constructing prompts that lead to better performance.
arXiv Detail & Related papers (2024-11-12T13:14:09Z)
Enhancing Answer Attribution for Faithful Text Generation with Large Language Models [5.065947993017158]
We propose new methods for producing more independent and contextualized claims for better retrieval and attribution. New methods are evaluated and shown to improve the performance of answer attribution components.
arXiv Detail & Related papers (2024-10-22T15:37:46Z)
Enhancing Post-Hoc Attributions in Long Document Comprehension via Coarse Grained Answer Decomposition [10.585679421637948]
Post-hoc attribution systems are designed to map answer text back to the source document, yet the granularity of this mapping has not been addressed. We propose and investigate a novel approach to the factual decomposition of generated answers for attribution, employing template-based in-context learning.
arXiv Detail & Related papers (2024-09-25T16:32:35Z)
Analysis of Plan-based Retrieval for Grounded Text Generation [78.89478272104739]
Hallucinations occur when a language model is given a generation task outside its parametric knowledge. A common strategy to address this limitation is to infuse the language models with retrieval mechanisms. We analyze how planning can be used to guide retrieval to further reduce the frequency of hallucinations.
arXiv Detail & Related papers (2024-08-20T02:19:35Z)
Attribute or Abstain: Large Language Models as Long Document Assistants [58.32043134560244]
LLMs can help humans working with long documents, but are known to hallucinate. Existing approaches to attribution have only been evaluated in RAG settings, where the initial retrieval confounds LLM performance. This is crucially different from the long document setting, where retrieval is not needed, but could help. We present LAB, a benchmark of 6 diverse long document tasks with attribution, and experiments with different approaches to attribution on 5 LLMs of different sizes.
arXiv Detail & Related papers (2024-07-10T16:16:02Z)
Context-augmented Retrieval: A Novel Framework for Fast Information Retrieval based Response Generation using Large Language Model [0.0]
As the corpus of contextual information grows, the answer/inference quality of Retrieval Augmented Generation (RAG) based Question Answering (QA) systems declines. This work addresses this problem by combining classical text classification with a Large Language Model (LLM). The new approach, Context Augmented Retrieval (CAR), demonstrates good-quality answer generation along with a significant reduction in information retrieval and answer generation time.
arXiv Detail & Related papers (2024-06-24T07:52:05Z)
Groundedness in Retrieval-augmented Long-form Generation: An Empirical Study [61.74571814707054]
We evaluate whether every generated sentence is grounded in retrieved documents or the model's pre-training data. Across 3 datasets and 4 model families, our findings reveal that a significant fraction of generated sentences are consistently ungrounded. Our results show that while larger models tend to ground their outputs more effectively, a significant portion of correct answers remains compromised by hallucinations.
arXiv Detail & Related papers (2024-04-10T14:50:10Z)
Context Matters: Pushing the Boundaries of Open-Ended Answer Generation with Graph-Structured Knowledge Context [4.1229332722825]
This paper introduces a novel framework that combines graph-driven context retrieval with knowledge-graph-based enhancement. We conduct experiments on various Large Language Models (LLMs) with different parameter sizes to evaluate their ability to ground knowledge and determine factual accuracy in answers to open-ended questions. Our methodology, GraphContextGen, consistently outperforms dominant text-based retrieval systems, demonstrating its robustness and adaptability to a larger number of use cases.
arXiv Detail & Related papers (2024-01-23T11:25:34Z)
DIVKNOWQA: Assessing the Reasoning Ability of LLMs via Open-Domain Question Answering over Knowledge Base and Text [73.68051228972024]
Large Language Models (LLMs) have exhibited impressive generation capabilities, but they suffer from hallucinations when relying on their internal knowledge. Retrieval-augmented LLMs have emerged as a potential solution to ground LLMs in external knowledge.
arXiv Detail & Related papers (2023-10-31T04:37:57Z)
Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection [74.51523859064802]
We introduce a new framework called Self-Reflective Retrieval-Augmented Generation (Self-RAG). Self-RAG enhances an LM's quality and factuality through retrieval and self-reflection. It significantly outperforms state-of-the-art LLMs and retrieval-augmented models on a diverse set of tasks.
arXiv Detail & Related papers (2023-10-17T18:18:32Z)

This list is automatically generated from the titles and abstracts of the papers on this site.