Related papers: Enhancing Long Context Performance in LLMs Through Inner Loop Query Mechanism

URL: http://arxiv.org/abs/2410.12859v1
Date: Fri, 11 Oct 2024 19:49:05 GMT
Title: Enhancing Long Context Performance in LLMs Through Inner Loop Query Mechanism
Authors: Yimin Tang, Yurong Xu, Ning Yan, Masood Mortazavi
Abstract summary: Transformers have a quadratic scaling of computational complexity with input size. Retrieval-augmented generation (RAG) can better handle longer contexts by using a retrieval system. We introduce a novel approach, Inner Loop Memory Augmented Tree Retrieval (ILM-TR).
Score: 2.919891871101241
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Transformers have a quadratic scaling of computational complexity with input size, which limits the input context window size of large language models (LLMs) in both training and inference. Meanwhile, retrieval-augmented generation (RAG) based models can better handle longer contexts by using a retrieval system to filter out unnecessary information. However, most RAG methods only perform retrieval based on the initial query, which may not work well with complex questions that require deeper reasoning. We introduce a novel approach, Inner Loop Memory Augmented Tree Retrieval (ILM-TR), involving inner-loop queries, based not only on the query question itself but also on intermediate findings. At inference time, our model retrieves information from the RAG system, integrating data from lengthy documents at various levels of abstraction. Based on the information retrieved, the LLM generates text that is stored in an area named Short-Term Memory (STM), which is then used to formulate the next query. This retrieval process is repeated until the text in STM converges. Our experiments demonstrate that retrieval with STM offers improvements over traditional retrieval-augmented LLMs, particularly in long context tests such as Multi-Needle In A Haystack (M-NIAH) and BABILong.
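The inner loop described in the abstract (retrieve, write to STM, form the next query from the question plus STM, stop when STM no longer changes) can be illustrated with a minimal sketch. This is not the authors' implementation: the word-overlap retriever, the `min_overlap` threshold, the exclusion of already-retrieved chunks, and the `toy_generate` stand-in for the LLM are all assumptions made for illustration, and the tree-structured retrieval over multiple levels of abstraction is not modeled.

```python
from typing import Callable, List


def retrieve(query: str, chunks: List[str], exclude: List[str],
             k: int = 1, min_overlap: int = 2) -> List[str]:
    """Hypothetical toy retriever: rank unseen chunks by word overlap with the query."""
    query_words = set(query.lower().split())
    scored = []
    for chunk in chunks:
        if chunk in exclude:
            continue  # assumption: do not return a chunk that was already retrieved
        overlap = len(query_words & set(chunk.lower().split()))
        if overlap >= min_overlap:
            scored.append((overlap, chunk))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:k]]


def toy_generate(question: str, stm: str, retrieved: List[str]) -> str:
    """Stand-in for the LLM: append the retrieved chunks to the STM text verbatim."""
    return "\n".join(stm.splitlines() + retrieved)


def inner_loop(question: str, chunks: List[str],
               generate: Callable[[str, str, List[str]], str],
               max_iters: int = 5) -> str:
    """Repeat retrieval with a query built from the question and STM until STM stops changing."""
    stm = ""
    seen: List[str] = []
    for _ in range(max_iters):
        query = f"{question}\n{stm}".strip()  # next query uses intermediate findings
        retrieved = retrieve(query, chunks, exclude=seen)
        seen.extend(retrieved)
        new_stm = generate(question, stm, retrieved)
        if new_stm == stm:  # STM converged
            break
        stm = new_stm
    return stm


CHUNKS = [
    "Mary moved to the kitchen",
    "The key is with Mary",
    "The weather was sunny in Paris",
    "John bought apples yesterday",
]

final_stm = inner_loop("Where is the key", CHUNKS, toy_generate)
print(final_stm)
```

In this toy run, the initial question alone only reaches "The key is with Mary"; the chunk about Mary's location is retrieved on the second pass, once "Mary" has entered STM, and the loop stops on the third pass when nothing new clears the overlap threshold.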

Related papers

ELITE: Embedding-Less retrieval with Iterative Text Exploration [5.8851517822935335]
Large Language Models (LLMs) have achieved impressive progress in natural language processing. Their limited ability to retain long-term context constrains performance on document-level or multi-turn tasks.
arXiv Detail & Related papers (2025-05-17T08:48:43Z)
Emulating Retrieval Augmented Generation via Prompt Engineering for Enhanced Long Context Comprehension in LLMs [23.960451986662996]
This paper proposes a method that emulates Retrieval Augmented Generation (RAG) through specialized prompt engineering and chain-of-thought reasoning. We evaluate our approach on selected tasks from BABILong, which interleaves standard bAbI QA problems with large amounts of distractor text.
arXiv Detail & Related papers (2025-02-18T02:49:40Z)
Can we Retrieve Everything All at Once? ARM: An Alignment-Oriented LLM-based Retrieval Method [48.14236175156835]
ARM aims to better align the question with the organization of the data collection by exploring relationships among data objects. It outperforms standard RAG with query decomposition by up to 5.2 pt in execution accuracy and agentic RAG (ReAct) by up to 15.9 pt. It achieves up to 5.5 pt and 19.3 pt higher F1 match scores compared to these approaches.
arXiv Detail & Related papers (2025-01-30T18:07:19Z)
MM-Embed: Universal Multimodal Retrieval with Multimodal LLMs [78.5013630951288]
This paper introduces techniques for advancing information retrieval with multimodal large language models (MLLMs). We first study fine-tuning an MLLM as a bi-encoder retriever on 10 datasets with 16 retrieval tasks. We propose modality-aware hard negative mining to mitigate the modality bias exhibited by MLLM retrievers.
arXiv Detail & Related papers (2024-11-04T20:06:34Z)
Better RAG using Relevant Information Gain [1.5604249682593647]
A common way to extend the memory of large language models (LLMs) is by retrieval augmented generation (RAG). We propose a novel simple optimization metric based on relevant information gain, a probabilistic measure of the total information relevant to a query for a set of retrieved results. When used as a drop-in replacement for the retrieval component of a RAG system, this method yields state-of-the-art performance on question answering tasks.
arXiv Detail & Related papers (2024-07-16T18:09:21Z)
LightPAL: Lightweight Passage Retrieval for Open Domain Multi-Document Summarization [9.739781953744606]
Open-Domain Multi-Document Summarization (ODMDS) is the task of generating summaries from large document collections in response to user queries. Traditional retrieve-then-summarize approaches fall short for open-ended queries in ODMDS tasks. We propose LightPAL, a lightweight passage retrieval method for ODMDS.
arXiv Detail & Related papers (2024-06-18T10:57:27Z)
Toward Conversational Agents with Context and Time Sensitive Long-term Memory [8.085414868117917]
Until recently, most work on RAG has focused on information retrieval from large databases of texts, like Wikipedia. We argue that effective retrieval from long-form conversational data faces two unique problems compared to static database retrieval. We generate a new dataset of ambiguous and time-based questions that build upon a recent dataset of long-form, simulated conversations.
arXiv Detail & Related papers (2024-05-29T18:19:46Z)
Question-Based Retrieval using Atomic Units for Enterprise RAG [3.273958158967657]
Enterprise retrieval augmented generation (RAG) offers a flexible framework for combining powerful large language models (LLMs) with internal, possibly temporally changing, documents. This work applies a zero-shot adaptation of standard dense retrieval steps for more accurate chunk recall.
arXiv Detail & Related papers (2024-05-20T20:27:00Z)
Allies: Prompting Large Language Model with Beam Search [107.38790111856761]
In this work, we propose a novel method called ALLIES. Given an input query, ALLIES leverages LLMs to iteratively generate new queries related to the original query. By iteratively refining and expanding the scope of the original query, ALLIES captures and utilizes hidden knowledge that may not be directly obtainable through retrieval.
arXiv Detail & Related papers (2023-05-24T06:16:44Z)
Query Rewriting for Retrieval-Augmented Large Language Models [139.242907155883]
Large Language Models (LLMs) act as powerful, black-box readers in the retrieve-then-read pipeline. This work introduces a new framework for retrieval-augmented LLMs, Rewrite-Retrieve-Read, in place of the previous retrieve-then-read approach.
arXiv Detail & Related papers (2023-05-23T17:27:50Z)
Synergistic Interplay between Search and Large Language Models for Information Retrieval [141.18083677333848]
InteR allows retrieval models (RMs) to expand knowledge in queries using LLM-generated knowledge collections. InteR achieves overall superior zero-shot retrieval performance compared to state-of-the-art methods.
arXiv Detail & Related papers (2023-05-12T11:58:15Z)
Large Language Models are Strong Zero-Shot Retriever [89.16756291653371]
We propose a simple method that applies a large language model (LLM) to large-scale retrieval in zero-shot scenarios. Our method, the Language model as Retriever (LameR), is built upon no other neural models but an LLM.
arXiv Detail & Related papers (2023-04-27T14:45:55Z)
Query2doc: Query Expansion with Large Language Models [69.9707552694766]
The proposed method first generates pseudo-documents by few-shot prompting large language models (LLMs). query2doc boosts the performance of BM25 by 3% to 15% on ad-hoc IR datasets. Our method also benefits state-of-the-art dense retrievers in terms of both in-domain and out-of-domain results.
arXiv Detail & Related papers (2023-03-14T07:27:30Z)
