Related papers: PADME: Procedure Aware DynaMic Execution

URL: http://arxiv.org/abs/2510.11281v1
Date: Mon, 13 Oct 2025 11:15:49 GMT
Title: PADME: Procedure Aware DynaMic Execution
Authors: Deepeka Garg, Sihan Zeng, Annapoorani L. Narayanan, Sumitra Ganesh, Leo Ardon,
Abstract summary: We introduce Procedure Aware DynaMic Execution (PADME), an agent framework that produces and exploits a graph-based representation of procedures. Unlike prior work that relies on manual graph construction or unstructured reasoning, PADME autonomously transforms procedural text into executable graphs. PADME achieves state-of-the-art performance on four diverse benchmarks, including ALFWorld and ScienceWorld.
Score: 7.8148770419284865
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Learning to autonomously execute long-horizon procedures from natural language remains a core challenge for intelligent agents. Free-form instructions such as recipes, scientific protocols, or business workflows encode rich procedural knowledge, but their variability and lack of structure cause agents driven by large language models (LLMs) to drift or fail during execution. We introduce Procedure Aware DynaMic Execution (PADME), an agent framework that produces and exploits a graph-based representation of procedures. Unlike prior work that relies on manual graph construction or unstructured reasoning, PADME autonomously transforms procedural text into executable graphs that capture task dependencies, decision points, and reusable subroutines. Central to PADME is a two-phase methodology: a Teach phase, which focuses on systematically structuring procedures and enriching them with executable logic, followed by an Execute phase, which enables dynamic execution in response to real-time inputs and environment feedback. This separation ensures quality assurance and scalability, allowing expert knowledge to be encoded once and reliably reused across varying contexts. The graph representation also provides an inductive bias that reduces error accumulation in long-horizon reasoning, underscoring the importance of structured procedure modeling for reliable agent-driven automation. Empirically, PADME achieves state-of-the-art performance on four diverse benchmarks, including ALFWorld and ScienceWorld. These results demonstrate that agents equipped with graph-based procedure representations offer a powerful intermediate abstraction for robust and generalizable execution.
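The two-phase idea in the abstract can be illustrated with a small sketch. This is not PADME's implementation: the `Node` schema, the `teach` and `execute` functions, and the water-heating procedure are all hypothetical, and the Teach phase is replaced here by a hand-built graph rather than an LLM-driven parse of procedural text. It only shows how a graph with plain steps and decision points can be built once and then walked in response to changing state.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Node:
    """One step of a procedure graph (hypothetical schema, not PADME's)."""
    name: str
    # Optional side effect applied to the shared state when the node runs.
    action: Optional[Callable[[dict], None]] = None
    # For decision points: inspects the state and returns the next node name.
    choose_next: Optional[Callable[[dict], Optional[str]]] = None
    # Default successor for plain steps; None marks the end of the procedure.
    successor: Optional[str] = None


def teach() -> Dict[str, Node]:
    """Stand-in for a Teach phase: returns a hand-built procedure graph."""
    nodes = [
        Node("heat_water",
             action=lambda s: s.update(temp=s["temp"] + 50),
             successor="check_boiling"),
        # Decision point: loop back until the water is hot enough.
        Node("check_boiling",
             choose_next=lambda s: "add_pasta" if s["temp"] >= 100 else "heat_water"),
        Node("add_pasta",
             action=lambda s: s.update(pasta_added=True)),
    ]
    return {n.name: n for n in nodes}


def execute(graph: Dict[str, Node], start: str, state: dict,
            max_steps: int = 50) -> List[str]:
    """Stand-in for an Execute phase: walk the graph, reacting to the state."""
    trace: List[str] = []
    current: Optional[str] = start
    for _ in range(max_steps):  # guard against unbounded loops
        if current is None:
            break
        node = graph[current]
        trace.append(node.name)
        if node.action is not None:
            node.action(state)
        current = node.choose_next(state) if node.choose_next else node.successor
    return trace


if __name__ == "__main__":
    state = {"temp": 20}
    print(execute(teach(), "heat_water", state))
    print(state)
```

Because the graph is plain data, the same `teach()` output can be executed against different starting states, which mirrors the abstract's point about encoding a procedure once and reusing it across contexts.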

Related papers

The Auton Agentic AI Framework [5.410458076724158]
The field of Artificial Intelligence is undergoing a transition from Generative AI to Agentic AI. This transition exposes a fundamental architectural mismatch: Large Language Models (LLMs) produce unstructured outputs, whereas the backend infrastructure they must control requires deterministic, schema-conformant inputs. The present paper describes the Auton Agentic AI Framework, a principled architecture for the creation and governance of autonomous agents.
arXiv Detail & Related papers (2026-02-27T06:42:08Z)
El Agente Gráfico: Structured Execution Graphs for Scientific Agents [7.47895130442454]
We present El Agente Gráfico, a single-agent framework that embeds large language model (LLM)-driven decision-making within a type-safe execution environment. Central to our approach is a structured abstraction of scientific concepts and an object-graph mapper that represents computational state as typed Python objects. We evaluate the system by developing an automated benchmarking framework across a suite of university-level quantum chemistry tasks.
arXiv Detail & Related papers (2026-02-19T23:47:05Z)
Multi-Agent Procedural Graph Extraction with Structural and Logical Refinement [66.51979814832332]
The model formulates procedural graph extraction as a multi-round reasoning process with dedicated structural and logical refinement. Experiments demonstrate that the model achieves substantial improvements in both structural correctness and logical consistency over strong baselines.
arXiv Detail & Related papers (2026-01-27T04:00:48Z)
Monadic Context Engineering [59.95390010097654]
This paper introduces Monadic Context Engineering (MCE) to provide a formal foundation for agent design. We demonstrate how Monads enable robust composition, how Applicatives provide a principled structure for parallel execution, and crucially, how Monad Transformers allow for the systematic composition of these capabilities. This layered approach enables developers to construct complex, resilient, and efficient AI agents from simple, independently verifiable components.
arXiv Detail & Related papers (2025-12-27T01:52:06Z)
Code-in-the-Loop Forensics: Agentic Tool Use for Image Forgery Detection [59.04089915447622]
ForenAgent is an interactive IFD framework that enables MLLMs to autonomously generate, execute, and refine Python-based low-level tools around the detection objective. Inspired by human reasoning, we design a dynamic reasoning loop comprising global perception, local focusing, iterative probing, and holistic adjudication. Experiments show that ForenAgent exhibits emergent tool-use competence and reflective reasoning on challenging IFD tasks.
arXiv Detail & Related papers (2025-12-18T08:38:44Z)
AgentPRM: Process Reward Models for LLM Agents via Step-Wise Promise and Progress [71.02263260394261]
Large language models (LLMs) still encounter challenges in multi-turn decision-making tasks. We build process reward models (PRMs) to evaluate each decision and guide the agent's decision-making process. AgentPRM captures both the interdependence between sequential decisions and their contribution to the final goal.
arXiv Detail & Related papers (2025-11-11T14:57:54Z)
Blueprint First, Model Second: A Framework for Deterministic LLM Workflow [3.9886771197662925]
We introduce the Source Code Agent framework, a new paradigm built on the "Blueprint First, Model Second" philosophy. Our framework decouples the workflow logic from the generative model. Our work enables the verifiable and reliable deployment of autonomous agents in applications governed by strict procedural logic.
arXiv Detail & Related papers (2025-08-01T03:10:00Z)
State and Memory is All You Need for Robust and Reliable AI Agents [29.259008600842517]
Large language models (LLMs) have enabled powerful advances in natural language understanding and generation. Yet their application to complex, real-world scientific remain limited by challenges in memory, planning, and tool integration. Here, we introduce SciBORG, a modular agentic framework that allows LLM-based agents to autonomously plan, reason, and achieve robust and reliable domain-specific task execution.
arXiv Detail & Related papers (2025-06-30T02:02:35Z)
Unlocking Smarter Device Control: Foresighted Planning with a World Model-Driven Code Execution Approach [82.27842884709378]
We propose a framework that prioritizes natural language understanding and structured reasoning to enhance the agent's global understanding of the environment. Our method outperforms previous approaches, particularly achieving a 44.4% relative improvement in task success rate.
arXiv Detail & Related papers (2025-05-22T09:08:47Z)
Generating Structured Plan Representation of Procedures with LLMs [5.623006055588189]
We introduce SOP Structuring (SOPStruct), a novel approach to transform SOPs into structured representations. SOPStruct produces a standardized representation of SOPs across different domains, reduces cognitive load, and improves user comprehension. Our research highlights the transformative potential of Large Language Models to streamline process modeling.
arXiv Detail & Related papers (2025-03-28T22:38:24Z)
Learning Task Representations from In-Context Learning [73.72066284711462]
Large language models (LLMs) have demonstrated remarkable proficiency in in-context learning. We introduce an automated formulation for encoding task information in ICL prompts as a function of attention heads. We show that our method's effectiveness stems from aligning the distribution of the last hidden state with that of an optimally performing in-context-learned model.
arXiv Detail & Related papers (2025-02-08T00:16:44Z)
Procedures as Programs: Hierarchical Control of Situated Agents through Natural Language [81.73820295186727]
We propose a formalism of procedures as programs, a powerful yet intuitive method of representing hierarchical procedural knowledge for agent command and control. We instantiate this framework on the IQA and ALFRED datasets for NL instruction following.
arXiv Detail & Related papers (2021-09-16T20:36:21Z)

This list is automatically generated from the titles and abstracts of the papers on this site.