Safety Aware Task Planning via Large Language Models in Robotics
- URL: http://arxiv.org/abs/2503.15707v1
- Date: Wed, 19 Mar 2025 21:41:10 GMT
- Title: Safety Aware Task Planning via Large Language Models in Robotics
- Authors: Azal Ahmad Khan, Michael Andrev, Muhammad Ali Murtaza, Sergio Aguilera, Rui Zhang, Jie Ding, Seth Hutchinson, Ali Anwar
- Abstract summary: This paper introduces SAFER (Safety-Aware Framework for Execution in Robotics), a multi-LLM framework designed to embed safety awareness into robotic task planning. Our framework integrates safety feedback at multiple stages of execution, enabling real-time risk assessment, proactive error correction, and transparent safety evaluation.
- Score: 22.72668275829238
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The integration of large language models (LLMs) into robotic task planning has unlocked better reasoning capabilities for complex, long-horizon workflows. However, ensuring safety in LLM-driven plans remains a critical challenge, as these models often prioritize task completion over risk mitigation. This paper introduces SAFER (Safety-Aware Framework for Execution in Robotics), a multi-LLM framework designed to embed safety awareness into robotic task planning. SAFER employs a Safety Agent that operates alongside the primary task planner, providing safety feedback. Additionally, we introduce LLM-as-a-Judge, a novel metric leveraging LLMs as evaluators to quantify safety violations within generated task plans. Our framework integrates safety feedback at multiple stages of execution, enabling real-time risk assessment, proactive error correction, and transparent safety evaluation. We also integrate a control framework using Control Barrier Functions (CBFs) to ensure safety guarantees within SAFER's task planning. We evaluate SAFER against state-of-the-art LLM planners on complex long-horizon tasks involving heterogeneous robotic agents, demonstrating its effectiveness in reducing safety violations while maintaining task efficiency. We further validate the task planner and safety agent through hardware experiments involving multiple robots and a human.
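This listing does not include code, but the planner/safety-agent interaction described in the abstract can be sketched as a simple feedback loop. The Python below is a minimal illustration under assumed interfaces: `query_llm` is a hypothetical stand-in for any chat-completion client, and the 0-10 judge scale is an assumption, not the paper's exact protocol.

```python
# Minimal sketch of a SAFER-style planner / safety-agent loop.
# `query_llm` is a hypothetical helper standing in for any chat-completion
# API; prompts and the 0-10 judge scale are illustrative assumptions.

def query_llm(role_prompt: str, user_prompt: str) -> str:
    """Placeholder for a call to an LLM chat-completion endpoint."""
    raise NotImplementedError("wire up your LLM client here")

def plan_with_safety_feedback(task: str, max_rounds: int = 3) -> str:
    plan = query_llm("You are a robot task planner.",
                     f"Plan step by step: {task}")
    for _ in range(max_rounds):
        # Safety Agent critiques the candidate plan.
        critique = query_llm(
            "You are a safety reviewer for robot plans.",
            f"Task: {task}\nPlan: {plan}\nList any safety violations.",
        )
        # LLM-as-a-Judge: map the critique to a numeric violation score.
        score = float(query_llm(
            "You are a judge. Answer with a single number 0-10.",
            f"Rate the severity of these violations (0 = none): {critique}",
        ))
        if score == 0.0:
            break  # no violations found; accept the plan
        # Feed the critique back to the planner for proactive correction.
        plan = query_llm(
            "You are a robot task planner.",
            f"Revise this plan to fix the safety issues.\n"
            f"Plan: {plan}\nIssues: {critique}",
        )
    return plan
```

At execution time, SAFER additionally layers Control Barrier Functions beneath the symbolic plan, so low-level commands remain constrained even when the LLM-level loop misses a hazard.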
Related papers
- A Framework for Benchmarking and Aligning Task-Planning Safety in LLM-Based Embodied Agents [13.225168384790257]
Large Language Models (LLMs) exhibit substantial promise in enhancing task-planning capabilities within embodied agents.
We present Safe-BeAl, an integrated framework for the measurement (SafePlan-Bench) and alignment (Safe-Align) of LLM-based embodied agents' behaviors.
Our empirical analysis reveals that even in the absence of adversarial inputs or malicious intent, LLM-based agents can exhibit unsafe behaviors.
arXiv Detail & Related papers (2025-04-20T15:12:14Z)
- Graphormer-Guided Task Planning: Beyond Static Rules with LLM Safety Perception [4.424170214926035]
We propose a risk-aware task planning framework that combines large language models with structured safety modeling. Our approach constructs a dynamic semantic safety graph, capturing spatial and contextual risk factors. Unlike existing methods that rely on predefined safety constraints, our framework introduces a context-aware risk perception module.
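The abstract only names the data structure, but a semantic safety graph can be sketched as risk-weighted relations between scene entities. The fields and example risks below are illustrative assumptions, not the paper's API.

```python
# Illustrative semantic safety graph: nodes are scene entities, edges carry
# a contextual risk weight. Field names are assumptions for illustration.
from dataclasses import dataclass, field

@dataclass
class SafetyGraph:
    nodes: set = field(default_factory=set)    # entity ids, e.g. "knife"
    risk: dict = field(default_factory=dict)   # (a, b) -> risk in [0, 1]

    def add_risk(self, a: str, b: str, weight: float) -> None:
        self.nodes |= {a, b}
        self.risk[(a, b)] = weight

    def riskiest_pairs(self, threshold: float = 0.5):
        return [(pair, w) for pair, w in self.risk.items() if w >= threshold]

g = SafetyGraph()
g.add_risk("robot_arm", "human_hand", 0.9)  # spatial proximity risk
g.add_risk("kettle", "paper_stack", 0.7)    # contextual hazard: heat near paper
print(g.riskiest_pairs())  # pairs a planner should treat as constraints
```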
arXiv Detail & Related papers (2025-03-10T02:43:54Z)
- Safe LLM-Controlled Robots with Formal Guarantees via Reachability Analysis [0.6749750044497732]
This paper introduces a safety assurance framework for Large Language Model (LLM)-controlled robots, based on data-driven reachability analysis. Our approach provides rigorous safety guarantees against unsafe behaviors without relying on explicit analytical models.
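As a rough illustration of data-driven reachability, one can bound the states visited by sampled rollouts and flag intersection with an unsafe region. The interval-hull check below is a deliberate simplification of the paper's analysis, with placeholder dynamics.

```python
# Toy data-driven reachability check: sample rollouts of an unknown dynamics
# function, take the axis-aligned interval hull of visited states, and flag
# intersection with an unsafe box. A simplification for illustration only.
import numpy as np

def reachable_hull(step, x0, n_samples=200, horizon=10, noise=0.05):
    """Interval hull of states reached from x0 under noisy sampled rollouts."""
    states = []
    for _ in range(n_samples):
        x = np.asarray(x0, dtype=float)
        for _ in range(horizon):
            x = step(x) + np.random.uniform(-noise, noise, size=x.shape)
            states.append(x.copy())
    states = np.array(states)
    return states.min(axis=0), states.max(axis=0)

def intersects(lo, hi, box_lo, box_hi):
    """Axis-aligned overlap test between the hull and an unsafe box."""
    return bool(np.all(lo <= box_hi) and np.all(box_lo <= hi))

step = lambda x: x + np.array([0.1, 0.0])  # placeholder dynamics
lo, hi = reachable_hull(step, x0=[0.0, 0.0])
unsafe = intersects(lo, hi, np.array([0.9, -0.1]), np.array([1.2, 0.1]))
print("unsafe region reachable:", unsafe)
```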
arXiv Detail & Related papers (2025-03-05T21:23:15Z)
- AGrail: A Lifelong Agent Guardrail with Effective and Adaptive Safety Detection [47.83354878065321]
We propose AGrail, a lifelong guardrail to enhance agent safety. AGrail features adaptive safety check generation, effective safety check optimization, and tool compatibility and flexibility.
arXiv Detail & Related papers (2025-02-17T05:12:33Z)
- AgentGuard: Repurposing Agentic Orchestrator for Safety Evaluation of Tool Orchestration [0.3222802562733787]
AgentGuard is a framework to autonomously discover and validate unsafe tool-use. It generates safety constraints to confine agent behavior, establishing a baseline safety guarantee. The framework operates through four phases: identifying unsafe tool-use workflows, validating them in real-world execution, generating safety constraints, and validating the efficacy of those constraints.
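A minimal structural sketch of the four phases, assuming placeholder function names; the real framework drives each phase with an LLM orchestrator rather than hand-written stubs.

```python
# Structural sketch of AgentGuard's four-phase cycle. All names are
# placeholders; each phase would be implemented by the agentic orchestrator.

def identify_unsafe_workflows(agent_tools):
    """Phase 1: propose tool-use sequences that may be unsafe."""
    ...

def validate_in_execution(candidates):
    """Phase 2: confirm which candidates are actually unsafe when run."""
    ...

def generate_constraints(confirmed):
    """Phase 3: emit safety constraints that block the confirmed behaviors."""
    ...

def validate_constraints(constraints, confirmed):
    """Phase 4: re-run the unsafe cases and check the constraints hold."""
    ...

def agentguard_cycle(agent_tools):
    candidates = identify_unsafe_workflows(agent_tools)
    confirmed = validate_in_execution(candidates)
    constraints = generate_constraints(confirmed)
    return validate_constraints(constraints, confirmed)
```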
arXiv Detail & Related papers (2025-02-13T23:00:33Z)
- Global Challenge for Safe and Secure LLMs Track 1 [57.08717321907755]
This paper introduces the Global Challenge for Safe and Secure Large Language Models (LLMs), a pioneering initiative organized by AI Singapore (AISG) and the CyberSG R&D Programme Office (CRPO) to foster the development of advanced defense mechanisms against automated jailbreaking attacks.
arXiv Detail & Related papers (2024-11-21T08:20:31Z)
- Defining and Evaluating Physical Safety for Large Language Models [62.4971588282174]
Large Language Models (LLMs) are increasingly used to control robotic systems such as drones.
However, their potential to cause physical threats and harm in real-world applications remains unexplored.
We classify the physical safety risks of drones into four categories: (1) human-targeted threats, (2) object-targeted threats, (3) infrastructure attacks, and (4) regulatory violations.
arXiv Detail & Related papers (2024-11-04T17:41:25Z)
- SCANS: Mitigating the Exaggerated Safety for LLMs via Safety-Conscious Activation Steering [56.92068213969036]
Safety alignment is indispensable for Large Language Models (LLMs) to defend against threats from malicious instructions.
Recent research reveals that safety-aligned LLMs are prone to rejecting benign queries due to exaggerated safety.
We propose Safety-Conscious Activation Steering (SCANS) to mitigate these exaggerated safety concerns.
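Activation steering in general adds a direction vector to a model's hidden states at chosen layers; the toy NumPy sketch below conveys that idea only, not SCANS's anchoring or layer-selection procedure.

```python
# Toy activation steering: shift hidden states along a "refusal" direction
# extracted from contrasting prompt sets. Shapes and values are illustrative;
# SCANS's actual layer selection and anchoring are more involved.
import numpy as np

def refusal_direction(harmful_acts, benign_acts):
    """Difference of mean activations between harmful and benign prompts."""
    d = harmful_acts.mean(axis=0) - benign_acts.mean(axis=0)
    return d / np.linalg.norm(d)

def steer(hidden, direction, alpha=-4.0):
    """Subtract the refusal direction (alpha < 0) to soften over-refusal."""
    return hidden + alpha * direction

rng = np.random.default_rng(0)
harmful = rng.normal(1.0, 1.0, size=(64, 16))  # fake layer activations
benign = rng.normal(0.0, 1.0, size=(64, 16))
d = refusal_direction(harmful, benign)
h = rng.normal(size=16)                        # one token's hidden state
print(steer(h, d)[:4])                         # steered hidden state
```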
arXiv Detail & Related papers (2024-08-21T10:01:34Z)
- On the Vulnerability of LLM/VLM-Controlled Robotics [54.57914943017522]
We highlight vulnerabilities in robotic systems integrating large language models (LLMs) and vision-language models (VLMs) due to input modality sensitivities. Our results show that simple input perturbations reduce task execution success rates by 22.2% and 14.6% in two representative LLM/VLM-controlled robotic systems.
arXiv Detail & Related papers (2024-02-15T22:01:45Z)
- TrustAgent: Towards Safe and Trustworthy LLM-based Agents [50.33549510615024]
This paper presents an Agent-Constitution-based agent framework, TrustAgent, with a focus on improving the safety of LLM-based agents.
The proposed framework ensures strict adherence to the Agent Constitution through three strategic components: a pre-planning strategy that injects safety knowledge into the model before plan generation, an in-planning strategy that enhances safety during plan generation, and a post-planning strategy that ensures safety through post-planning inspection.
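A minimal sketch of the three stages, assuming a generic `llm: str -> str` callable; the prompts and the SAFE/UNSAFE verdict format are illustrative assumptions, not TrustAgent's actual interface.

```python
# Structural sketch of TrustAgent's three safety stages. The prompt strings
# and the `llm` callable are placeholders, not the framework's real API.

def trustagent_plan(llm, task, constitution):
    # Pre-planning: prime the model with safety knowledge before planning.
    llm(f"Safety rules to follow in all subsequent steps:\n{constitution}")
    # In-planning: generate the plan with the constitution kept in context.
    plan = llm(f"Rules:\n{constitution}\nPlan this task safely: {task}")
    # Post-planning: inspect the finished plan against the constitution.
    verdict = llm(f"Rules:\n{constitution}\nAnswer SAFE or UNSAFE:\n{plan}")
    return plan if verdict.strip().upper().startswith("SAFE") else None

# Usage with any chat-completion wrapper `my_llm: str -> str`:
# plan = trustagent_plan(my_llm, "heat soup on the stove", kitchen_rules)
```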
arXiv Detail & Related papers (2024-02-02T17:26:23Z)
- Evaluating Model-free Reinforcement Learning toward Safety-critical Tasks [70.76757529955577]
This paper revisits prior work in this scope from the perspective of state-wise safe RL.
We propose Unrolling Safety Layer (USL), a joint method that combines safety optimization and safety projection.
To facilitate further research in this area, we reproduce related algorithms in a unified pipeline and incorporate them into SafeRL-Kit.
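As a generic illustration of safety projection (one half of USL), a proposed action can be nudged along the gradient of a cost model until the predicted cost meets the budget; the quadratic cost below stands in for a learned critic and is not USL's exact formulation.

```python
# Generic safety projection: move a proposed action down the gradient of a
# differentiable cost model until the predicted cost is within budget.
import numpy as np

def project_action(action, cost_fn, cost_grad, budget=0.0, lr=0.1, steps=50):
    a = np.asarray(action, dtype=float)
    for _ in range(steps):
        if cost_fn(a) <= budget:
            break                    # predicted cost satisfies the constraint
        a = a - lr * cost_grad(a)    # step toward the safe region
    return a

cost_fn = lambda a: float(a @ a) - 1.0  # stand-in: safe iff |a|^2 <= 1
cost_grad = lambda a: 2.0 * a
raw = np.array([1.5, 1.5])              # unsafe proposal from the policy
print(project_action(raw, cost_fn, cost_grad))  # projected toward unit ball
```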
arXiv Detail & Related papers (2022-12-12T06:30:17Z)