Related papers: Chain of Agents: Large Language Models Collaborating on Long-Context Tasks

Chain of Agents: Large Language Models Collaborating on Long-Context Tasks

URL: http://arxiv.org/abs/2406.02818v1
Date: Tue, 4 Jun 2024 23:36:08 GMT
Title: Chain of Agents: Large Language Models Collaborating on Long-Context Tasks
Authors: Yusen Zhang, Ruoxi Sun, Yanfei Chen, Tomas Pfister, Rui Zhang, Sercan Ö. Arik,
Abstract summary: Chain-of-Agents (CoA) is a novel framework that harnesses multi-agent collaboration through natural language to enable information aggregation and context reasoning. CoA processes the entire input by interleaving reading and reasoning, and it mitigates long context focus issues by assigning each agent a short context.
Score: 39.27648679819897
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Addressing the challenge of effectively processing long contexts has become a critical issue for Large Language Models (LLMs). Two common strategies have emerged: 1) reducing the input length, such as retrieving relevant chunks by Retrieval-Augmented Generation (RAG), and 2) expanding the context window limit of LLMs. However, both strategies have drawbacks: input reduction has no guarantee of covering the part with needed information, while window extension struggles with focusing on the pertinent information for solving the task. To mitigate these limitations, we propose Chain-of-Agents (CoA), a novel framework that harnesses multi-agent collaboration through natural language to enable information aggregation and context reasoning across various LLMs over long-context tasks. CoA consists of multiple worker agents who sequentially communicate to handle different segmented portions of the text, followed by a manager agent who synthesizes these contributions into a coherent final output. CoA processes the entire input by interleaving reading and reasoning, and it mitigates long context focus issues by assigning each agent a short context. We perform comprehensive evaluation of CoA on a wide range of long-context tasks in question answering, summarization, and code completion, demonstrating significant improvements by up to 10% over strong baselines of RAG, Full-Context, and multi-agent LLMs.

Related papers

Long Context Scaling: Divide and Conquer via Multi-Agent Question-driven Collaboration [11.477571238310276]
We propose a novel multi-agent framework for processing long contexts.<n>XpandA (Expand-Agent) is coupled with question-driven workflow and dynamic partitioning.<n>XpandA achieves 20% improvements and 1.5x inference speedup over baselines of full-context, RAG and previous agent-based methods.
arXiv Detail & Related papers (2025-05-27T02:05:42Z)
Self-Taught Agentic Long Context Understanding [47.186303525057475]
AgenticLU is a framework designed to integrate targeted self-clarification with contextual grounding. AgenticLU achieves 97.8% answer recall on NarrativeQA with a search depth of up to three and a branching factor of eight.
arXiv Detail & Related papers (2025-02-21T20:29:36Z)
Emulating Retrieval Augmented Generation via Prompt Engineering for Enhanced Long Context Comprehension in LLMs [23.960451986662996]
This paper proposes a method that emulates Retrieval Augmented Generation (RAG) through specialized prompt engineering and chain-of-thought reasoning. We evaluate our approach on selected tasks from BABILong, which interleaves standard bAbI QA problems with large amounts of distractor text.
arXiv Detail & Related papers (2025-02-18T02:49:40Z)
Meta-Chunking: Learning Text Segmentation and Semantic Completion via Logical Perception [10.614437503578856]
This paper proposes the Meta-Chunking framework, which specifically enhances chunking quality.<n>We design two adaptive chunking techniques based on uncertainty, namely Perplexity Chunking and Margin Sampling Chunking.<n>We establish the global information compensation mechanism, encompassing a two-stage hierarchical summary generation process and a three-stage text chunk rewriting procedure.
arXiv Detail & Related papers (2024-10-16T17:59:32Z)
Holistic Reasoning with Long-Context LMs: A Benchmark for Database Operations on Massive Textual Data [6.195658947075431]
We introduce HoloBench, a framework that brings database reasoning operations into text-based contexts. We show that the amount of information in the context has a bigger influence on LCLM performance than the context length. We find that tasks requiring the aggregation of multiple pieces of information show a noticeable drop in accuracy as context length increases.
arXiv Detail & Related papers (2024-10-15T19:04:13Z)
LLM$\times$MapReduce: Simplified Long-Sequence Processing using Large Language Models [73.13933847198395]
We propose a training-free framework for processing long texts, utilizing a divide-and-conquer strategy to achieve comprehensive document understanding. The proposed LLM$times$MapReduce framework splits the entire document into several chunks for LLMs to read and then aggregates the intermediate answers to produce the final output.
arXiv Detail & Related papers (2024-10-12T03:13:44Z)
FltLM: An Intergrated Long-Context Large Language Model for Effective Context Filtering and Understanding [32.197113821638936]
We propose a novel integrated Long-Context Large Language Model (FltLM) FltLM incorporates a context filter with a soft mask mechanism, identifying and dynamically excluding irrelevant content to concentrate on pertinent information. Experimental results demonstrate that FltLM significantly outperforms supervised fine-tuning and retrieval-based methods in complex QA scenarios.
arXiv Detail & Related papers (2024-10-09T13:47:50Z)
SEGMENT+: Long Text Processing with Short-Context Language Models [53.40059130780192]
SEGMENT+ is a framework that enables LMs to handle extended inputs within limited context windows efficiently. SEGMENT+ utilizes structured notes and a filtering module to manage information flow, resulting in a system that is both controllable and interpretable.
arXiv Detail & Related papers (2024-10-09T03:40:22Z)
NeedleBench: Can LLMs Do Retrieval and Reasoning in 1 Million Context Window? [37.64593022203498]
NeedleBench is a framework consisting of progressively more challenging tasks for assessing bilingual long-context capabilities. We use the framework to assess how well the leading open-source models can identify key information relevant to the question. We propose the Ancestral Trace Challenge to mimic the complexity of logical reasoning challenges that are likely to be present in real-world long-context tasks.
arXiv Detail & Related papers (2024-07-16T17:59:06Z)
Large Multimodal Agents: A Survey [78.81459893884737]
Large language models (LLMs) have achieved superior performance in powering text-based AI agents. There is an emerging research trend focused on extending these LLM-powered AI agents into the multimodal domain. This review aims to provide valuable insights and guidelines for future research in this rapidly evolving field.
arXiv Detail & Related papers (2024-02-23T06:04:23Z)
LMRL Gym: Benchmarks for Multi-Turn Reinforcement Learning with Language Models [56.25156596019168]
This paper introduces the LMRL-Gym benchmark for evaluating multi-turn RL for large language models (LLMs) Our benchmark consists of 8 different language tasks, which require multiple rounds of language interaction and cover a range of tasks in open-ended dialogue and text games.
arXiv Detail & Related papers (2023-11-30T03:59:31Z)
LooGLE: Can Long-Context Language Models Understand Long Contexts? [46.143956498529796]
LooGLE is a benchmark for large language models' long context understanding. It features relatively new documents post-2022, with over 24,000 tokens per document and 6,000 newly generated questions spanning diverse domains. The evaluation of eight state-of-the-art LLMs on LooGLE revealed key findings.
arXiv Detail & Related papers (2023-11-08T01:45:37Z)
Recursion of Thought: A Divide-and-Conquer Approach to Multi-Context Reasoning with Language Models [58.41943058963672]
We propose a new inference framework called Recursion of Thought (RoT) RoT introduces several special tokens that the models can output to trigger context-related operations. Experiments with multiple architectures including GPT-3 show that RoT dramatically improves LMs' inference capability to solve problems.
arXiv Detail & Related papers (2023-06-12T06:34:16Z)

This list is automatically generated from the titles and abstracts of the papers in this site.