ReCode: Unify Plan and Action for Universal Granularity Control
- URL: http://arxiv.org/abs/2510.23564v2
- Date: Tue, 28 Oct 2025 03:22:35 GMT
- Title: ReCode: Unify Plan and Action for Universal Granularity Control
- Authors: Zhaoyang Yu, Jiayi Zhang, Huixue Su, Yufan Zhao, Yifan Wu, Mingyi Deng, Jinyu Xiang, Yizhang Lin, Lingxiao Tang, Yingchao Li, Yuyu Luo, Bang Liu, Chenglin Wu
- Abstract summary: Real-world tasks require decisions at varying granularities, and humans excel at this by leveraging a unified cognitive representation. Current Large Language Model (LLM)-based agents lack this crucial capability to operate fluidly across decision granularities. We propose ReCode, a novel paradigm that addresses this limitation by unifying planning and action within a single code representation.
- Score: 35.49121741462282
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Real-world tasks require decisions at varying granularities, and humans excel at this by leveraging a unified cognitive representation where planning is fundamentally understood as a high-level form of action. However, current Large Language Model (LLM)-based agents lack this crucial capability to operate fluidly across decision granularities. This limitation stems from existing paradigms that enforce a rigid separation between high-level planning and low-level action, which impairs dynamic adaptability and limits generalization. We propose ReCode (Recursive Code Generation), a novel paradigm that addresses this limitation by unifying planning and action within a single code representation. In this representation, ReCode treats high-level plans as abstract placeholder functions, which the agent then recursively decomposes into finer-grained sub-functions until reaching primitive actions. This recursive approach dissolves the rigid boundary between plan and action, enabling the agent to dynamically control its decision granularity. Furthermore, the recursive structure inherently generates rich, multi-granularity training data, enabling models to learn hierarchical decision-making processes. Extensive experiments show ReCode significantly surpasses advanced baselines in inference performance and demonstrates exceptional data efficiency in training, validating our core insight that unifying planning and action through recursive code generation is a powerful and effective approach to achieving universal granularity control. The code is available at https://github.com/FoundationAgents/ReCode.
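The recursive decomposition described in the abstract can be illustrated with a minimal sketch. All names here (the primitive set, the `decompose` stub standing in for the LLM expansion call, and the task names) are hypothetical assumptions for illustration, not the paper's actual API:

```python
# Minimal sketch of ReCode-style recursive decomposition: abstract plan
# steps are placeholder functions that get expanded until only primitive
# actions remain. `decompose` is a stand-in for the LLM call.
PRIMITIVES = {"move", "grasp", "release"}

def is_primitive(step):
    return step in PRIMITIVES

def decompose(step):
    # Hypothetical expansions; in ReCode these come from the model,
    # which writes finer-grained sub-functions for each placeholder.
    expansions = {
        "make_tea": ["boil_water", "steep"],
        "boil_water": ["move", "grasp", "release"],
        "steep": ["move", "grasp"],
    }
    return expansions.get(step, [])

def recode(step, trace):
    """Recursively expand a plan step down to primitive actions."""
    if is_primitive(step):
        trace.append(step)       # primitive action: execute directly
        return
    for sub in decompose(step):  # abstract plan: expand one level deeper
        recode(sub, trace)

trace = []
recode("make_tea", trace)
print(trace)  # flattened sequence of primitive actions
```

Note how the same `recode` function handles both "planning" (expanding a placeholder) and "acting" (emitting a primitive), which is the unification the paper argues for; every intermediate expansion is also a natural source of multi-granularity training data.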
Related papers
- HiMAC: Hierarchical Macro-Micro Learning for Long-Horizon LLM Agents [19.63866851076813]
HiMAC is a hierarchical agentic RL framework that decomposes long-horizon decision-making into macro-level planning and micro-level execution. Our results show that introducing structured hierarchy, rather than increasing model scale alone, is a key factor for enabling robust long-horizon agentic intelligence.
arXiv Detail & Related papers (2026-03-01T08:09:03Z) - PseudoAct: Leveraging Pseudocode Synthesis for Flexible Planning and Action Control in Large Language Model Agents [5.5917393750392925]
Large language model (LLM) agents typically rely on reactive decision-making paradigms such as ReAct. This paper introduces PseudoAct, a novel framework for flexible planning and action control in LLM agents through pseudocode synthesis. Experiments on benchmark datasets show that our method significantly outperforms existing reactive agent approaches.
arXiv Detail & Related papers (2026-02-27T04:30:45Z) - Unlocking Smarter Device Control: Foresighted Planning with a World Model-Driven Code Execution Approach [82.27842884709378]
We propose a framework that prioritizes natural language understanding and structured reasoning to enhance the agent's global understanding of the environment. Our method outperforms previous approaches, particularly achieving a 44.4% relative improvement in task success rate.
arXiv Detail & Related papers (2025-05-22T09:08:47Z) - Embodied Long Horizon Manipulation with Closed-loop Code Generation and Incremental Few-shot Adaptation [12.077740860502878]
Embodied long-horizon manipulation requires robotic systems to process multimodal inputs, such as vision and natural language, and translate them into executable actions. Recent methods have explored using large language models (LLMs) as high-level planners that decompose tasks into subtasks using natural language and guide pretrained low-level controllers. Our framework achieves state-of-the-art performance on 30+ diverse seen and unseen long-horizon tasks across LoHoRavens, CALVIN, Franka Kitchen, and cluttered real-world settings.
arXiv Detail & Related papers (2025-03-27T20:32:58Z) - Tree-of-Code: A Hybrid Approach for Robust Complex Task Planning and Execution [3.229241113813517]
We propose a novel approach called Tree-of-Code (ToC) to tackle the challenges of complex problem planning and execution with an end-to-end mechanism. In our framework, each final code execution result is treated as a node in the decision tree, with a breadth-first search strategy employed to explore potential solutions.
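The search procedure described above can be sketched as a generic breadth-first traversal over candidate programs. The helper names (`expand`, `execute`, `is_success`) and the toy arithmetic task are assumptions for illustration, not the ToC paper's actual interface:

```python
from collections import deque

def tree_of_code_bfs(root_code, expand, execute, is_success):
    """BFS over candidate code solutions: each execution result is a
    node in the tree, and children are refinements of that candidate."""
    queue = deque([root_code])
    while queue:
        code = queue.popleft()
        result = execute(code)
        if is_success(result):
            return code, result
        queue.extend(expand(code, result))  # enqueue refined candidates
    return None, None

# Toy instantiation: search for an arithmetic expression evaluating to 10.
found, value = tree_of_code_bfs(
    "2 + 3",
    expand=lambda code, res: [code + " + 1"] if res < 10 else [],
    execute=lambda code: eval(code),
    is_success=lambda res: res == 10,
)
print(found, value)
```

The breadth-first order guarantees the shallowest (least-refined) successful candidate is returned first, which matches the paper's emphasis on systematically exploring potential solutions rather than committing to a single reactive rollout.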
arXiv Detail & Related papers (2024-12-18T08:47:17Z) - Action abstractions for amortized sampling [49.384037138511246]
We propose an approach to incorporate the discovery of action abstractions, or high-level actions, into the policy optimization process.
Our approach involves iteratively extracting action subsequences commonly used across many high-reward trajectories and 'chunking' them into a single action that is added to the action space.
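The extraction step described above can be sketched as counting shared subsequences across trajectories. The fixed chunk length, the frequency threshold, and the toy action names are hypothetical simplifications; the paper's actual abstraction-discovery criterion is more involved:

```python
from collections import Counter

def chunk_common_subsequences(trajectories, length=2, min_count=2):
    """Find action subsequences shared across high-reward trajectories
    and return them as candidate macro-actions ('chunks')."""
    counts = Counter()
    for traj in trajectories:
        for i in range(len(traj) - length + 1):
            counts[tuple(traj[i:i + length])] += 1
    # Keep only subsequences frequent enough to be worth abstracting.
    return [seq for seq, c in counts.items() if c >= min_count]

trajs = [
    ["up", "left", "pick"],
    ["up", "left", "drop"],
    ["down", "up", "left"],
]
print(chunk_common_subsequences(trajs))  # [('up', 'left')]
```

Each returned chunk would then be added to the action space as a single high-level action, and the process repeated, which is the iterative loop the abstract describes.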
arXiv Detail & Related papers (2024-10-19T19:22:50Z) - Unlocking Reasoning Potential in Large Language Models by Scaling Code-form Planning [94.76546523689113]
We introduce CodePlan, a framework that generates and follows code-form plans -- pseudocode that outlines high-level, structured reasoning processes.
CodePlan effectively captures the rich semantics and control flows inherent to sophisticated reasoning tasks.
It achieves a 25.1% relative improvement compared with directly generating responses.
arXiv Detail & Related papers (2024-09-19T04:13:58Z) - Entropy-Regularized Token-Level Policy Optimization for Language Agent Reinforcement [67.1393112206885]
Large Language Models (LLMs) have shown promise as intelligent agents in interactive decision-making tasks.
We introduce Entropy-Regularized Token-level Policy Optimization (ETPO), an entropy-augmented RL method tailored for optimizing LLMs at the token level.
We assess the effectiveness of ETPO within a simulated environment that models data science code generation as a series of multi-step interactive tasks.
arXiv Detail & Related papers (2024-02-09T07:45:26Z) - Rethinking Decision Transformer via Hierarchical Reinforcement Learning [54.3596066989024]
Decision Transformer (DT) is an innovative algorithm leveraging recent advances of the transformer architecture in reinforcement learning (RL).
We introduce a general sequence modeling framework for studying sequential decision making through the lens of Hierarchical RL.
We show DT emerges as a special case of this framework with certain choices of high-level and low-level policies, and discuss the potential failure of these choices.
arXiv Detail & Related papers (2023-11-01T03:32:13Z) - Feudal Graph Reinforcement Learning [18.069747511100132]
Graph-based representations and message-passing modular policies constitute prominent approaches to tackling composable control problems in reinforcement learning (RL). We propose a novel methodology, named Feudal Graph Reinforcement Learning (FGRL), that addresses such challenges by relying on hierarchical RL and a pyramidal message-passing architecture. In particular, FGRL defines a hierarchy of policies where high-level commands are propagated from the top of the hierarchy down through a layered graph structure.
arXiv Detail & Related papers (2023-04-11T09:51:13Z) - Goal Kernel Planning: Linearly-Solvable Non-Markovian Policies for Logical Tasks with Goal-Conditioned Options [54.40780660868349]
We introduce a compositional framework called Linearly-Solvable Goal Kernel Dynamic Programming (LS-GKDP). LS-GKDP combines the Linearly-Solvable Markov Decision Process (LMDP) formalism with the Options Framework of Reinforcement Learning. We show how an LMDP with a goal kernel enables the efficient optimization of meta-policies in a lower-dimensional subspace defined by the task grounding.
arXiv Detail & Related papers (2020-07-06T05:13:20Z) - From proprioception to long-horizon planning in novel environments: A hierarchical RL model [4.44317046648898]
In this work, we introduce a simple, three-level hierarchical architecture that reflects different types of reasoning.
We apply our method to a series of navigation tasks in the Mujoco Ant environment.
arXiv Detail & Related papers (2020-06-11T17:19:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.