Scaling Long-Horizon LLM Agent via Context-Folding
- URL: http://arxiv.org/abs/2510.11967v1
- Date: Mon, 13 Oct 2025 22:00:58 GMT
- Title: Scaling Long-Horizon LLM Agent via Context-Folding
- Authors: Weiwei Sun, Miao Lu, Zhan Ling, Kang Liu, Xuesong Yao, Yiming Yang, Jiecao Chen
- Abstract summary: We introduce Context-Folding, a framework that empowers agents to actively manage their working context. An agent can procedurally branch into a sub-trajectory to handle a subtask and then fold it upon completion, collapsing the intermediate steps while retaining a concise summary of the outcome.
- Score: 46.685552398338295
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language model (LLM) agents are fundamentally constrained by context length on long-horizon tasks. We introduce Context-Folding, a framework that empowers agents to actively manage their working context. An agent can procedurally branch into a sub-trajectory to handle a subtask and then fold it upon completion, collapsing the intermediate steps while retaining a concise summary of the outcome. To make this behavior learnable, we develop an end-to-end reinforcement learning framework FoldGRPO with specific process rewards to encourage effective task decomposition and context management. On complex long-horizon tasks (Deep Research and SWE), our folding agent matches or outperforms the ReAct baselines while using an active context 10$\times$ smaller and significantly outperforms models that rely on summarization-based context management.
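The branch/fold loop the abstract describes can be illustrated with a minimal toy sketch. This is not the paper's implementation; the names `ContextFolder`, `branch`, and `fold` are hypothetical stand-ins for whatever interface the agent actually uses.

```python
# Toy sketch of context folding: branch into a sub-trajectory for a
# subtask, then fold it back into a one-line summary. All names here
# are illustrative, not the paper's API.

class ContextFolder:
    """Working context that can branch into a sub-trajectory and
    later fold it back into a concise summary."""

    def __init__(self):
        self.context = []          # active working context
        self._branch_start = None  # index where the open branch began

    def add(self, step):
        self.context.append(step)

    def branch(self, subtask):
        # Open a sub-trajectory for a subtask.
        self._branch_start = len(self.context)
        self.context.append(f"[branch] {subtask}")

    def fold(self, summary):
        # Collapse every step since branch() into a concise summary.
        assert self._branch_start is not None, "no open branch"
        self.context = self.context[: self._branch_start]
        self.context.append(f"[folded] {summary}")
        self._branch_start = None


folder = ContextFolder()
folder.add("read task description")
folder.branch("locate the failing test")
folder.add("grep for test name")
folder.add("open test file")
folder.add("inspect stack trace")
folder.fold("failing test is tests/test_io.py::test_load")
print(folder.context)
# The branch marker and intermediate steps are gone; only the
# folded summary remains in the active context.
```

After `fold`, the active context holds two entries instead of five, which is the mechanism behind the 10x smaller active context the abstract reports.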
Related papers
- LOCA-bench: Benchmarking Language Agents Under Controllable and Extreme Context Growth [32.1520194112537]
Large language models (LLMs) are increasingly capable of carrying out long-running, real-world tasks. As the amount of context grows, their reliability often deteriorates, a phenomenon known as "context rot".
arXiv Detail & Related papers (2026-02-08T13:20:39Z) - Context as a Tool: Context Management for Long-Horizon SWE-Agents [38.950807465620365]
We propose CAT, a new context management paradigm that elevates context maintenance to a callable tool integrated into the decision-making process of agents. CAT formalizes a structured context workspace consisting of stable task semantics, condensed long-term memory, and high-fidelity short-term interactions. We show that SWE-Compressor reaches a 57.6% solved rate and significantly outperforms ReAct-based agents and static compression baselines.
arXiv Detail & Related papers (2025-12-26T17:15:47Z) - CoDA: A Context-Decoupled Hierarchical Agent with Reinforcement Learning [12.710191300398924]
We introduce CoDA, a reinforcement learning framework that decouples high-level planning from low-level execution. CoDA achieves significant performance improvements over state-of-the-art baselines on complex multi-hop question-answering benchmarks.
arXiv Detail & Related papers (2025-12-14T14:41:29Z) - Self-Abstraction from Grounded Experience for Plan-Guided Policy Refinement [61.35824395228412]
Large language model (LLM) based agents are increasingly used to tackle software engineering tasks. We propose Self-Abstraction from Grounded Experience (SAGE), a framework that enables agents to learn from their own task executions.
arXiv Detail & Related papers (2025-11-08T08:49:38Z) - AgentFold: Long-Horizon Web Agents with Proactive Context Management [98.54523771369018]
LLM-based web agents show immense promise for information seeking, yet their effectiveness is hindered by a fundamental trade-off in context management. We introduce AgentFold, a novel agent paradigm centered on proactive context management. With simple supervised fine-tuning, our AgentFold-30B-A3B agent achieves 36.2% on BrowseComp and 47.3% on BrowseComp-ZH.
arXiv Detail & Related papers (2025-10-28T17:51:50Z) - COMPASS: Enhancing Agent Long-Horizon Reasoning with Evolving Context [17.575806280348797]
Small errors compound across steps, and even state-of-the-art models often hallucinate or lose coherence. We propose a lightweight hierarchical framework that separates tactical execution, strategic oversight, and context organization into three specialized components.
arXiv Detail & Related papers (2025-10-09T20:14:26Z) - Scaling LLM Multi-turn RL with End-to-end Summarization-based Context Management [19.980762483472354]
We introduce summarization-based context management to training. We instantiate this framework with SUmmarization augmented Policy Optimization (SUPO). Our results establish summarization-based context management as a principled and scalable approach for training RL agents beyond a fixed context length limit.
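The core loop this abstract describes, summarizing the history whenever the rollout exceeds a fixed context limit and then continuing, might look like the toy sketch below. Every name here is illustrative; `summarize` stands in for an LLM summarizer.

```python
# Toy sketch of summarization-based context management: once the
# history exceeds a fixed budget, it is replaced by a summary and
# the rollout continues. All names here are illustrative.

def summarize(history):
    # Stand-in for an LLM summarizer; a real system would condense
    # the history into natural language.
    return [f"summary-of-{len(history)}-steps"]

def rollout(steps, max_steps_in_context=8):
    history = []
    for step in steps:
        history.append(step)
        if len(history) > max_steps_in_context:
            history = summarize(history)  # fold history, keep going
    return history

final = rollout([f"s{i}" for i in range(20)])
print(final)  # history never holds more than 9 entries at once
```

The key property is that the rollout can run for arbitrarily many steps while the live context stays bounded, which is what makes the approach usable as an RL training loop beyond a fixed context length.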
arXiv Detail & Related papers (2025-10-08T07:29:22Z) - ContextNav: Towards Agentic Multimodal In-Context Learning [85.05420047017513]
ContextNav is an agentic framework that integrates the scalability of automated retrieval with the quality and adaptiveness of human-like curation. It builds a resource-aware multimodal embedding pipeline, maintains a retrievable vector database, and applies agentic retrieval and structural alignment to construct noise-resilient contexts. Experimental results demonstrate that ContextNav achieves state-of-the-art performance across various datasets.
arXiv Detail & Related papers (2025-10-06T07:49:52Z) - Process-Supervised Reinforcement Learning for Interactive Multimodal Tool-Use Agents [34.720205364467546]
We introduce a sandbox environment for reinforcement learning (RL) that supports interleaved speech-text rollouts. Our core strategy, Turn-level Adjudicated Reinforcement Learning (TARL), addresses the challenge of credit assignment in long-horizon tasks. This unified approach boosts the task pass rate on the text-based $\tau$-bench by over 6% compared to strong RL baselines.
arXiv Detail & Related papers (2025-09-17T23:25:00Z) - Chain of Agents: Large Language Models Collaborating on Long-Context Tasks [39.27648679819897]
Chain-of-Agents (CoA) is a novel framework that harnesses multi-agent collaboration through natural language to enable information aggregation and context reasoning.
CoA processes the entire input by interleaving reading and reasoning, and it mitigates long context focus issues by assigning each agent a short context.
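The CoA pattern of interleaved reading and reasoning, where each worker sees only a short chunk plus the previous worker's running message, could be sketched as below. `worker` and `manager` are plain stand-ins for LLM calls, not the paper's code.

```python
# Illustrative sketch of the Chain-of-Agents idea: each worker sees
# only a short segment plus the running message from the previous
# worker, so no single agent needs the full long context.

def chunk(words, size):
    # Split the long input into short segments of `size` words.
    return [words[i:i + size] for i in range(0, len(words), size)]

def worker(message, segment, query_words):
    # Stand-in for a worker LLM: extract query-relevant tokens from
    # its short segment and append them to the running message.
    return message + [w for w in segment if w in query_words]

def manager(message):
    # Stand-in for the manager LLM: answer from the accumulated
    # message, never from the raw long input.
    return sorted(set(message))

def chain_of_agents(long_text, query, chunk_size=50):
    query_words = set(query.split())
    message = []
    for segment in chunk(long_text.split(), chunk_size):
        message = worker(message, segment, query_words)
    return manager(message)
```

Because each worker's input is bounded by `chunk_size` plus the running message, the chain mitigates long-context focus issues in exactly the sense the abstract describes.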
arXiv Detail & Related papers (2024-06-04T23:36:08Z) - ADaPT: As-Needed Decomposition and Planning with Language Models [131.063805299796]
We introduce As-Needed Decomposition and Planning for complex Tasks (ADaPT).
ADaPT explicitly plans and decomposes complex sub-tasks as needed, when the large language model is unable to execute them.
Our results demonstrate that ADaPT substantially outperforms established strong baselines.
arXiv Detail & Related papers (2023-11-08T17:59:15Z) - Semantically Aligned Task Decomposition in Multi-Agent Reinforcement Learning [56.26889258704261]
We propose a novel "disentangled" decision-making method, Semantically Aligned task decomposition in MARL (SAMA).
SAMA prompts pretrained language models with chain-of-thought that can suggest potential goals, provide suitable goal decomposition and subgoal allocation as well as self-reflection-based replanning.
SAMA demonstrates considerable advantages in sample efficiency compared to state-of-the-art ASG methods.
arXiv Detail & Related papers (2023-05-18T10:37:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.