A Unified Framework for Real-Time Failure Handling in Robotics Using Vision-Language Models, Reactive Planner and Behavior Trees
- URL: http://arxiv.org/abs/2503.15202v2
- Date: Fri, 21 Mar 2025 08:10:48 GMT
- Title: A Unified Framework for Real-Time Failure Handling in Robotics Using Vision-Language Models, Reactive Planner and Behavior Trees
- Authors: Faseeh Ahmad, Hashim Ismail, Jonathan Styrud, Maj Stenmark, Volker Krueger,
- Abstract summary: This paper presents a unified failure recovery framework that combines Vision-Language Models (VLMs), a reactive planner, and Behavior Trees (BTs) to enable real-time failure handling.<n>Our approach includes pre-execution verification, which checks for potential failures before execution, and reactive failure handling, which detects and corrects failures during execution.<n>We evaluate our framework through real-world experiments with an ABB YuMi robot on tasks like peg insertion, object sorting, and drawer placement.
- Score: 1.3481665321936716
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Robotic systems often face execution failures due to unexpected obstacles, sensor errors, or environmental changes. Traditional failure recovery methods rely on predefined strategies or human intervention, making them less adaptable. This paper presents a unified failure recovery framework that combines Vision-Language Models (VLMs), a reactive planner, and Behavior Trees (BTs) to enable real-time failure handling. Our approach includes pre-execution verification, which checks for potential failures before execution, and reactive failure handling, which detects and corrects failures during execution by verifying existing BT conditions, adding missing preconditions and, when necessary, generating new skills. The framework uses a scene graph for structured environmental perception and an execution history for continuous monitoring, enabling context-aware and adaptive failure handling. We evaluate our framework through real-world experiments with an ABB YuMi robot on tasks like peg insertion, object sorting, and drawer placement, as well as in AI2-THOR simulator. Compared to using pre-execution and reactive methods separately, our approach achieves higher task success rates and greater adaptability. Ablation studies highlight the importance of VLM-based reasoning, structured scene representation, and execution history tracking for effective failure recovery in robotics.
Related papers
- Action Flow Matching for Continual Robot Learning [57.698553219660376]
Continual learning in robotics seeks systems that can constantly adapt to changing environments and tasks.
We introduce a generative framework leveraging flow matching for online robot dynamics model alignment.
We find that by transforming the actions themselves rather than exploring with a misaligned model, the robot collects informative data more efficiently.
arXiv Detail & Related papers (2025-04-25T16:26:15Z) - Code-as-Monitor: Constraint-aware Visual Programming for Reactive and Proactive Robotic Failure Detection [56.66677293607114]
We propose Code-as-Monitor (CaM) for both open-set reactive and proactive failure detection.<n>To enhance the accuracy and efficiency of monitoring, we introduce constraint elements that abstract constraint-related entities.<n>Experiments show that CaM achieves a 28.7% higher success rate and reduces execution time by 31.8% under severe disturbances.
arXiv Detail & Related papers (2024-12-05T18:58:27Z) - TI-PREGO: Chain of Thought and In-Context Learning for Online Mistake Detection in PRocedural EGOcentric Videos [48.126793563151715]
No technique effectively detects open-set procedural mistakes online.
One branch continuously performs step recognition from the input egocentric video.
The other anticipates future steps based on the recognition module's output.
arXiv Detail & Related papers (2024-11-04T20:03:06Z) - Reshaping the Online Data Buffering and Organizing Mechanism for Continual Test-Time Adaptation [49.53202761595912]
Continual Test-Time Adaptation involves adapting a pre-trained source model to continually changing unsupervised target domains.
We analyze the challenges of this task: online environment, unsupervised nature, and the risks of error accumulation and catastrophic forgetting.
We propose an uncertainty-aware buffering approach to identify and aggregate significant samples with high certainty from the unsupervised, single-pass data stream.
arXiv Detail & Related papers (2024-07-12T15:48:40Z) - Learning to Recover from Plan Execution Errors during Robot Manipulation: A Neuro-symbolic Approach [7.768747914019512]
We propose an approach (blending learning with symbolic search) for automated error discovery and recovery.
We present an anytime version of our algorithm, where instead of recovering to the last correct state, we search for a sub-goal in the original plan.
arXiv Detail & Related papers (2024-05-29T10:03:57Z) - A Self-Correcting Vision-Language-Action Model for Fast and Slow System Manipulation [30.207690822989292]
Self-corrected (SC-)VLA framework integrates fast system for directly predicting actions and slow system for reflecting on failed actions.
For the fast system, we incorporate parameter-efficient fine-tuning to equip the model with pose prediction capabilities.
For the slow system, we propose a Chain-of-Thought training strategy for failure correction, designed to mimic human reflection after a manipulation failure.
arXiv Detail & Related papers (2024-05-27T17:58:48Z) - Adaptable Recovery Behaviors in Robotics: A Behavior Trees and Motion Generators(BTMG) Approach for Failure Management [0.0]
We propose a novel approach that models recovery behaviors as adaptable robotic skills, leveraging the Behavior Trees and Motion Generators(BTMG) framework for policy representation.
We assess our methodology through a series of progressively challenging scenarios within a peg-in-a-hole task, demonstrating the approach's effectiveness in enhancing operational efficiency and task success rates in collaborative robotics settings.
arXiv Detail & Related papers (2024-04-09T08:56:43Z) - Interactive Planning Using Large Language Models for Partially
Observable Robotics Tasks [54.60571399091711]
Large Language Models (LLMs) have achieved impressive results in creating robotic agents for performing open vocabulary tasks.
We present an interactive planning technique for partially observable tasks using LLMs.
arXiv Detail & Related papers (2023-12-11T22:54:44Z) - Model-Based Runtime Monitoring with Interactive Imitation Learning [30.70994322652745]
This work aims to endow a robot with the ability to monitor and detect errors during task execution.
We introduce a model-based runtime monitoring algorithm that learns from deployment data to detect system anomalies and anticipate failures.
Our method outperforms the baselines across system-level and unit-test metrics, with 23% and 40% higher success rates in simulation and on physical hardware.
arXiv Detail & Related papers (2023-10-26T16:45:44Z) - Towards Robust Continual Learning with Bayesian Adaptive Moment Regularization [51.34904967046097]
Continual learning seeks to overcome the challenge of catastrophic forgetting, where a model forgets previously learnt information.
We introduce a novel prior-based method that better constrains parameter growth, reducing catastrophic forgetting.
Results show that BAdam achieves state-of-the-art performance for prior-based methods on challenging single-headed class-incremental experiments.
arXiv Detail & Related papers (2023-09-15T17:10:51Z) - Competence-Aware Path Planning via Introspective Perception [36.39015240656877]
We propose a structured model-free approach to competence-aware planning by reasoning about plan execution failures due to errors in perception.
We introduce competence-aware path planning via introspective perception (CPIP), a Bayesian framework to iteratively learn and exploit task-level competence.
arXiv Detail & Related papers (2021-09-28T18:29:21Z) - Adversarial vs behavioural-based defensive AI with joint, continual and
active learning: automated evaluation of robustness to deception, poisoning
and concept drift [62.997667081978825]
Recent advancements in Artificial Intelligence (AI) have brought new capabilities to behavioural analysis (UEBA) for cyber-security.
In this paper, we present a solution to effectively mitigate this attack by improving the detection process and efficiently leveraging human expertise.
arXiv Detail & Related papers (2020-01-13T13:54:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.