Scene Graph-Guided Proactive Replanning for Failure-Resilient Embodied Agent
- URL: http://arxiv.org/abs/2508.11286v1
- Date: Fri, 15 Aug 2025 07:48:51 GMT
- Title: Scene Graph-Guided Proactive Replanning for Failure-Resilient Embodied Agent
- Authors: Che Rin Yu, Daewon Chae, Dabin Seo, Sangwon Lee, Hyeongwoo Im, Jinkyu Kim,
- Abstract summary: We present a proactive replanning framework that detects and corrects failures at subtask boundaries. Experiments in the AI2-THOR simulator demonstrate that our approach detects semantic and spatial mismatches before execution failures occur.
- Score: 9.370683025542686
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: When humans perform everyday tasks, we naturally adjust our actions based on the current state of the environment. For instance, if we intend to put something into a drawer but notice it is closed, we open it first. However, many autonomous robots lack this adaptive awareness. They often follow pre-planned actions that may overlook subtle yet critical changes in the scene, which can result in actions being executed under outdated assumptions and eventual failure. While replanning is critical for robust autonomy, most existing methods respond only after failures occur, when recovery may be inefficient or infeasible. Proactive replanning holds promise for preventing failures in advance, but current solutions often rely on manually designed rules and extensive supervision. In this work, we present a proactive replanning framework that detects and corrects failures at subtask boundaries by comparing scene graphs constructed from current RGB-D observations against reference graphs extracted from successful demonstrations. When the current scene fails to align with reference trajectories, a lightweight reasoning module is activated to diagnose the mismatch and adjust the plan. Experiments in the AI2-THOR simulator demonstrate that our approach detects semantic and spatial mismatches before execution failures occur, significantly improving task success and robustness.
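The abstract's core mechanism, comparing a scene graph built from the current observation against a reference graph from successful demonstrations at each subtask boundary, can be sketched as a simple set-difference check. All names below (`Edge`, `SceneGraph`, `diff_graphs`, `check_subtask_boundary`) are illustrative assumptions, not the paper's actual API, and the reasoning module that the paper activates on a mismatch is stubbed out as a report string.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Edge:
    """One relation in a scene graph, e.g. ('mug', 'inside', 'drawer')."""
    subject: str
    relation: str
    obj: str

@dataclass
class SceneGraph:
    edges: set = field(default_factory=set)

def diff_graphs(current: SceneGraph, reference: SceneGraph):
    """Relations the reference expects but the current observation
    lacks, and relations present now that the reference never had."""
    missing = reference.edges - current.edges
    unexpected = current.edges - reference.edges
    return missing, unexpected

def check_subtask_boundary(current: SceneGraph, reference: SceneGraph) -> str:
    """Proactive check before executing the next subtask.
    In the paper, a mismatch activates a lightweight reasoning
    module to diagnose and repair the plan; here we only report it."""
    missing, unexpected = diff_graphs(current, reference)
    if not missing and not unexpected:
        return "proceed"
    return "replan: missing=%s unexpected=%s" % (
        sorted(e.relation for e in missing),
        sorted(e.relation for e in unexpected),
    )
```

For the drawer example from the abstract, a reference graph containing `Edge("drawer", "state", "open")` against a current graph containing `Edge("drawer", "state", "closed")` would trigger a replan before the put-away action is ever attempted, rather than after it fails.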
Related papers
- When Actions Go Off-Task: Detecting and Correcting Misaligned Actions in Computer-Use Agents [50.5814495434565]
This work makes the first effort to define and study misaligned action detection in computer-use agents (CUAs). We identify three common categories in real-world CUA deployment and construct MisActBench, a benchmark of realistic trajectories with human-annotated, action-level alignment labels. We propose DeAction, a practical and universal guardrail that detects misaligned actions before execution and iteratively corrects them through structured feedback.
arXiv Detail & Related papers (2026-02-09T18:41:15Z) - Hierarchical Vision Language Action Model Using Success and Failure Demonstrations [60.82332413442677]
We introduce VINE, a hierarchical vision-language-action model that separates high-level reasoning from low-level control. System 2 performs feasibility-guided tree search over a 2D scene-graph abstraction. System 1 executes low-level actions without modifying the agent's core skills.
arXiv Detail & Related papers (2025-12-03T15:58:38Z) - Guardian: Detecting Robotic Planning and Execution Errors with Vision-Language Models [53.20969621498248]
We propose an automatic robot failure synthesis approach that procedurally perturbs successful trajectories to generate diverse planning and execution failures. We construct three new failure detection benchmarks: RLBench-Fail, BridgeDataV2-Fail, and UR5-Fail. We then train Guardian, a VLM with multi-view images for detailed failure reasoning and detection.
arXiv Detail & Related papers (2025-12-01T17:57:27Z) - Building a Foundational Guardrail for General Agentic Systems via Synthetic Data [76.18834864749606]
Because LLM agents can plan multi-step tasks, intervening at the planning stage, before any action is executed, is often the safest way to prevent harm. Existing guardrails mostly operate post-execution, which is difficult to scale and leaves little room for controllable supervision at the plan level. We introduce AuraGen, a controllable engine that synthesizes benign trajectories, injects category-labeled risks with difficulty, and filters outputs via an automated reward model.
arXiv Detail & Related papers (2025-10-10T18:42:32Z) - Failure Prediction at Runtime for Generative Robot Policies [6.375597233389154]
Early failure prediction during runtime is essential for deploying robots in human-centered and safety-critical environments. We propose FIPER, a framework for failure prediction for generative robot policies that does not require failure data. Our results demonstrate that FIPER better distinguishes actual failures from benign OOD situations and predicts failures more accurately and earlier than existing methods.
arXiv Detail & Related papers (2025-10-10T15:09:27Z) - A Unified Framework for Real-Time Failure Handling in Robotics Using Vision-Language Models, Reactive Planner and Behavior Trees [1.3481665321936716]
This paper presents a unified failure recovery framework that combines Vision-Language Models (VLMs), a reactive planner, and Behavior Trees (BTs) to enable real-time failure handling. Our approach includes pre-execution verification, which checks for potential failures before execution, and reactive failure handling, which detects and corrects failures during execution. We evaluate our framework through real-world experiments with an ABB YuMi robot on tasks like peg insertion, object sorting, and drawer placement.
arXiv Detail & Related papers (2025-03-19T13:40:56Z) - Centaur: Robust End-to-End Autonomous Driving with Test-Time Training [84.78837437133234]
We propose Centaur, which updates a planner's behavior via test-time training without relying on hand-engineered rules or cost functions. We develop a novel uncertainty measure, called Cluster Entropy, which is simple, interpretable, and compatible with state-of-the-art planning algorithms.
arXiv Detail & Related papers (2025-03-14T17:59:41Z) - Code-as-Monitor: Constraint-aware Visual Programming for Reactive and Proactive Robotic Failure Detection [56.66677293607114]
We propose Code-as-Monitor (CaM) for both open-set reactive and proactive failure detection. To enhance the accuracy and efficiency of monitoring, we introduce constraint elements that abstract constraint-related entities. Experiments show that CaM achieves a 28.7% higher success rate and reduces execution time by 31.8% under severe disturbances.
arXiv Detail & Related papers (2024-12-05T18:58:27Z) - Bidirectional Decoding: Improving Action Chunking via Guided Test-Time Sampling [51.38330727868982]
We show how action chunking impacts the divergence between a learner and a demonstrator. We propose Bidirectional Decoding (BID), a test-time inference algorithm that bridges action chunking with closed-loop adaptation. Our method boosts the performance of two state-of-the-art generative policies across seven simulation benchmarks and two real-world tasks.
arXiv Detail & Related papers (2024-08-30T15:39:34Z) - Learning to Recover from Plan Execution Errors during Robot Manipulation: A Neuro-symbolic Approach [7.768747914019512]
We propose an approach (blending learning with symbolic search) for automated error discovery and recovery.
We present an anytime version of our algorithm, where instead of recovering to the last correct state, we search for a sub-goal in the original plan.
arXiv Detail & Related papers (2024-05-29T10:03:57Z) - Model Checking for Closed-Loop Robot Reactive Planning [0.0]
We show how model checking can be used to create multistep plans for a differential drive wheeled robot so that it can avoid immediate danger.
Using a small, purpose-built model-checking algorithm in situ, we generate plans in real time in a way that reflects the egocentric reactive response of simple biological agents.
arXiv Detail & Related papers (2023-11-16T11:02:29Z) - Enhancing Lattice-based Motion Planning with Introspective Learning and Reasoning [3.2689702143620143]
This work is concerned with introspective learning and reasoning about controller performance over time.
Normal controller execution of the different actions is learned using reliable and uncertainty-aware machine learning techniques.
Reasoning takes place both to verify that the learned models stay safe and to improve collision-checking effectiveness in the motion planner.
arXiv Detail & Related papers (2020-05-15T07:16:51Z)