Can't Remember Details in Long Documents? You Need Some R&R
- URL: http://arxiv.org/abs/2403.05004v1
- Date: Fri, 8 Mar 2024 03:03:20 GMT
- Title: Can't Remember Details in Long Documents? You Need Some R&R
- Authors: Devanshu Agrawal, Shang Gao, Martin Gajek
- Abstract summary: We introduce $\textit{R&R}$ -- a combination of two novel prompt-based methods called reprompting and in-context retrieval (ICR).
In reprompting, we repeat the prompt instructions periodically throughout the context document.
In ICR, rather than instructing the LLM to answer the question directly, we instruct it to retrieve the top $k$ passage numbers.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Long-context large language models (LLMs) hold promise for tasks such as
question-answering (QA) over long documents, but they tend to miss important
information in the middle of context documents (arXiv:2307.03172v3). Here, we
introduce $\textit{R&R}$ -- a combination of two novel prompt-based methods
called $\textit{reprompting}$ and $\textit{in-context retrieval}$ (ICR) -- to
alleviate this effect in document-based QA. In reprompting, we repeat the
prompt instructions periodically throughout the context document to remind the
LLM of its original task. In ICR, rather than instructing the LLM to answer the
question directly, we instruct it to retrieve the top $k$ passage numbers most
relevant to the given question, which are then used as an abbreviated context
in a second QA prompt. We test R&R with GPT-4 Turbo and Claude-2.1 on documents
up to 80k tokens in length and observe a 16-point boost in QA accuracy on
average. Our further analysis suggests that R&R improves performance on long
document-based QA because it reduces the distance between relevant context and
the instructions. Finally, we show that compared to short-context chunkwise
methods, R&R enables the use of larger chunks that cost fewer LLM calls and
output tokens, while minimizing the drop in accuracy.
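The two methods above are both implemented purely at the prompt level. A minimal sketch of how the prompts might be assembled is shown below; it assumes the document is already split into numbered passages, and all function names and the reminder interval are illustrative, not the authors' implementation.

```python
# Sketch of R&R prompt construction (hypothetical helper names, not the
# paper's code). Assumes `passages` is a list of passage strings.

INSTRUCTIONS = "Answer the question using the document below."

def reprompt(passages, question, every=3):
    """Reprompting: repeat the task instructions periodically
    throughout the context so they stay close to every passage."""
    parts = [INSTRUCTIONS]
    for i, passage in enumerate(passages, start=1):
        parts.append(f"[Passage {i}] {passage}")
        if i % every == 0:
            # Periodic reminder of the original task.
            parts.append(f"(Reminder) {INSTRUCTIONS}\nQuestion: {question}")
    parts.append(f"Question: {question}")
    return "\n\n".join(parts)

def icr_prompts(passages, question, k=3):
    """In-context retrieval (ICR): stage 1 asks the LLM for the top-k
    relevant passage numbers; stage 2 builds a short QA prompt from
    only those passages."""
    stage1 = reprompt(
        passages,
        f"List the numbers of the {k} passages most relevant to: {question}",
    )

    def stage2(passage_numbers):
        # Abbreviated context from the passage numbers the LLM returned.
        context = "\n\n".join(
            f"[Passage {n}] {passages[n - 1]}" for n in passage_numbers
        )
        return f"{INSTRUCTIONS}\n\n{context}\n\nQuestion: {question}"

    return stage1, stage2
```

In use, the stage-1 string would be sent to the LLM, its returned passage numbers parsed, and `stage2` called to produce the final, much shorter QA prompt; this is how ICR trades one extra LLM call for a drastically reduced context in the answering step.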
Related papers
- LLM$\times$MapReduce: Simplified Long-Sequence Processing using Large Language Models [73.13933847198395]
We propose a training-free framework for processing long texts, utilizing a divide-and-conquer strategy to achieve comprehensive document understanding.
The proposed LLM$\times$MapReduce framework splits the entire document into several chunks for LLMs to read and then aggregates the intermediate answers to produce the final output.
arXiv Detail & Related papers (2024-10-12T03:13:44Z)
- ChatQA 2: Bridging the Gap to Proprietary LLMs in Long Context and RAG Capabilities [53.97515452727115]
ChatQA 2 is a Llama 3.0-based model with a 128K context window.
We present a training recipe to extend the context window of Llama3-70B-base from 8K to 128K tokens.
Our results demonstrate that the Llama3-ChatQA-2-70B model outperforms most existing state-of-the-art models.
arXiv Detail & Related papers (2024-07-19T17:35:47Z)
- Refiner: Restructure Retrieval Content Efficiently to Advance Question-Answering Capabilities [30.1331670544648]
Large Language Models (LLMs) are limited by their parametric knowledge, leading to hallucinations in knowledge-extensive tasks.
We propose $\textit{Refiner}$, an end-to-end extract-and-restructure paradigm that operates in the post-retrieval process of RAG.
arXiv Detail & Related papers (2024-06-17T09:25:10Z)
- DR-RAG: Applying Dynamic Document Relevance to Retrieval-Augmented Generation for Question-Answering [4.364937306005719]
Retrieval-Augmented Generation (RAG) has recently improved the performance of Large Language Models (LLMs) on knowledge-intensive tasks such as Question-Answering (QA).
We find that even when some critical documents have low relevance to the query, the remaining documents can still be retrieved by combining parts of the already-retrieved documents with the query.
A two-stage retrieval framework called Dynamic-Relevant Retrieval-Augmented Generation (DR-RAG) is proposed to improve document retrieval recall and the accuracy of answers.
arXiv Detail & Related papers (2024-06-11T15:15:33Z)
- LLoCO: Learning Long Contexts Offline [63.3458260335454]
We propose LLoCO, a novel approach to processing long contexts.
LLoCO learns contexts offline through context compression and in-domain parameter-efficient finetuning with LoRA.
Our approach extends the effective context window of a 4k token LLaMA2-7B model to handle up to 128k tokens.
arXiv Detail & Related papers (2024-04-11T17:57:22Z)
- NovelQA: Benchmarking Question Answering on Documents Exceeding 200K Tokens [63.7488938083696]
NovelQA is a benchmark designed to test the capabilities of Large Language Models with extended texts.
This paper presents the design and construction of NovelQA, highlighting its manual annotation, and diverse question types.
Our evaluation of Long-context LLMs on NovelQA reveals significant insights into the models' performance.
arXiv Detail & Related papers (2024-03-18T17:32:32Z)
- Drilling Down into the Discourse Structure with LLMs for Long Document Question Answering [5.022057415488129]
We propose a suite of techniques that exploit the discourse structure commonly found in documents.
We show how our approach can be combined with a $\textit{self-ask}$ reasoning agent to achieve the best zero-shot performance in complex multi-hop question answering.
arXiv Detail & Related papers (2023-11-22T18:22:56Z)
- DAPR: A Benchmark on Document-Aware Passage Retrieval [57.45793782107218]
We propose and name this task $\textit{Document-Aware Passage Retrieval}$ (DAPR).
While analyzing the errors of the State-of-The-Art (SoTA) passage retrievers, we find the major errors (53.5%) are due to missing document context.
Our created benchmark enables future research on developing and comparing retrieval systems for the new task.
arXiv Detail & Related papers (2023-05-23T10:39:57Z)
- Interleaving Retrieval with Chain-of-Thought Reasoning for Knowledge-Intensive Multi-Step Questions [50.114651561111245]
We propose IRCoT, a new approach for multi-step question answering.
It interleaves retrieval with steps in a CoT, guiding the retrieval with CoT and in turn using retrieved results to improve CoT.
arXiv Detail & Related papers (2022-12-20T18:26:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.