Bridging Adaptivity and Safety: Learning Agile Collision-Free Locomotion Across Varied Physics
- URL: http://arxiv.org/abs/2501.04276v3
- Date: Wed, 19 Feb 2025 14:13:51 GMT
- Title: Bridging Adaptivity and Safety: Learning Agile Collision-Free Locomotion Across Varied Physics
- Authors: Yichao Zhong, Chong Zhang, Tairan He, Guanya Shi
- Abstract summary: BAS (Bridging Adaptivity and Safety) is designed to provide adaptive safety even in dynamic environments with uncertainties.
We show that BAS achieves 50% better safety than baselines in dynamic environments while maintaining a higher speed on average.
As a result, BAS achieves a 19.8% increase in speed and a 2.36 times lower collision rate than ABS in the real world.
- Score: 10.408245303948993
- Abstract: Real-world legged locomotion systems often need to reconcile agility and safety across different scenarios. Moreover, the underlying dynamics are often unknown and time-variant (e.g., payload, friction). In this paper, we introduce BAS (Bridging Adaptivity and Safety), which builds upon the pipeline of the prior work Agile But Safe (ABS) (He et al.) and is designed to provide adaptive safety even in dynamic environments with uncertainties. BAS involves an agile policy to avoid obstacles rapidly, a recovery policy to prevent collisions, a physical parameter estimator that is concurrently trained with the agile policy, and a learned control-theoretic reach-avoid (RA) value network that governs the policy switch. Both the agile policy and the RA network are conditioned on physical parameters to make them adaptive. To mitigate the distribution shift issue, we further introduce an on-policy fine-tuning phase for the estimator to enhance its robustness and accuracy. Simulation results show that BAS achieves 50% better safety than baselines in dynamic environments while maintaining a higher average speed. In real-world experiments, BAS shows its capability in complex environments with unknown physics (e.g., slippery floors with unknown friction, unknown payloads up to 8 kg), while baselines lack adaptivity, leading to collisions or degraded agility. As a result, BAS achieves a 19.8% increase in speed and a 2.36 times lower collision rate than ABS in the real world. Videos: https://adaptive-safe-locomotion.github.io.
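As a rough illustration of the pipeline described in the abstract, the following Python sketch shows how a learned reach-avoid value conditioned on estimated physical parameters could govern the agile/recovery switch. All names (estimate_params, ra_value, agile_policy, recovery_policy) and the sign convention (negative RA value means safe) are assumptions for illustration, not the paper's actual networks, observations, or thresholds.

```python
import numpy as np

# Hypothetical stand-ins for the learned modules named in the abstract.
def estimate_params(proprio_history):
    """Physical parameter estimator (payload, friction, ...), trained alongside the agile policy."""
    return np.tanh(proprio_history.mean(axis=0))      # placeholder estimate

def ra_value(obs, params):
    """Learned reach-avoid value; convention here: negative = collision-free reach is certified."""
    return float(obs[:2] @ params[:2]) - 0.5          # placeholder network output

def agile_policy(obs, params):
    return np.zeros(12)                               # placeholder joint targets

def recovery_policy(obs, params):
    return np.zeros(12)                               # placeholder joint targets

def bas_step(obs, proprio_history, ra_threshold=0.0):
    """One control step: estimate physics online, then let the RA value govern the policy switch."""
    params = estimate_params(proprio_history)
    if ra_value(obs, params) < ra_threshold:          # certified safe enough: stay agile
        return agile_policy(obs, params)
    return recovery_policy(obs, params)               # otherwise fall back to the recovery policy

action = bas_step(np.ones(48), np.ones((50, 48)))
```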
Related papers
- SATA: Safe and Adaptive Torque-Based Locomotion Policies Inspired by Animal Learning [10.138425472807368]
SATA is a bio-inspired framework that mimics key biomechanical principles and adaptive learning mechanisms observed in animal locomotion.
Our approach effectively addresses the inherent challenges of learning torque-based policies by significantly improving early-stage exploration.
Our experimental results indicate that SATA demonstrates remarkable compliance and safety, even in challenging environments.
arXiv Detail & Related papers (2025-02-18T09:25:37Z)
- RACER: Epistemic Risk-Sensitive RL Enables Fast Driving with Fewer Crashes [57.319845580050924]
We propose a reinforcement learning framework that combines risk-sensitive control with an adaptive action space curriculum.
We show that our algorithm is capable of learning high-speed policies for a real-world off-road driving task.
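Below is a minimal sketch (Python, invented names) of the "adaptive action space curriculum" half of that idea: the allowed speed range only grows while an externally estimated risk stays under a budget. It illustrates the general mechanism, not the authors' algorithm.

```python
# Hypothetical adaptive action-space curriculum: the speed limit expands only
# while an (externally supplied) risk estimate stays under budget.
def update_speed_limit(speed_limit, risk_estimate, risk_budget=0.1,
                       grow=1.05, shrink=0.8, v_min=1.0, v_max=15.0):
    if risk_estimate < risk_budget:       # behaving safely: allow faster driving
        speed_limit *= grow
    else:                                 # too risky: contract the action space
        speed_limit *= shrink
    return min(max(speed_limit, v_min), v_max)

limit = 2.0
for risk in [0.02, 0.05, 0.3, 0.04]:      # toy sequence of per-episode risk estimates
    limit = update_speed_limit(limit, risk)
```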
arXiv Detail & Related papers (2024-05-07T23:32:36Z)
- Learning Speed Adaptation for Flight in Clutter [3.8876619768726157]
Animals learn to adapt the speed of their movements to their capabilities and the environment they observe.
Mobile robots should also demonstrate this ability to trade-off aggressiveness and safety for efficiently accomplishing tasks.
This work endows flight vehicles with the ability to adapt their speed in previously unknown, partially observable, cluttered environments.
arXiv Detail & Related papers (2024-03-07T15:30:54Z)
- Agile But Safe: Learning Collision-Free High-Speed Legged Locomotion [13.647294304606316]
This paper introduces Agile But Safe (ABS), a learning-based control framework for quadrupedal robots.
ABS involves an agile policy to execute agile motor skills amidst obstacles and a recovery policy to prevent failures.
The training process involves the learning of the agile policy, the reach-avoid value network, the recovery policy, and an exteroception representation network.
arXiv Detail & Related papers (2024-01-31T03:58:28Z)
- Safe Deep Policy Adaptation [7.2747306035142225]
Policy adaptation based on reinforcement learning (RL) offers versatility and generalizability but presents safety and robustness challenges.
We propose SafeDPA, a novel RL and control framework that simultaneously tackles the problems of policy adaptation and safe reinforcement learning.
We provide theoretical safety guarantees of SafeDPA and show the robustness of SafeDPA against learning errors and extra perturbations.
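As one concrete picture of combining an adapted RL policy with control-theoretic safety machinery, here is a generic control-barrier-function style safety filter in closed form (invented helper names); SafeDPA's actual construction and the assumptions behind its guarantees may differ.

```python
import numpy as np

def cbf_filter(u_rl, x, f, g, h, grad_h, alpha=1.0):
    """Minimally modify the RL action so that d/dt h(x) >= -alpha * h(x) holds for
    control-affine dynamics x_dot = f(x) + g(x) u (closed-form projection, no QP solver)."""
    Lf_h = grad_h(x) @ f(x)
    Lg_h = grad_h(x) @ g(x)                            # shape (m,)
    slack = Lf_h + Lg_h @ u_rl + alpha * h(x)
    if slack >= 0.0:                                   # nominal action already satisfies the constraint
        return u_rl
    return u_rl - slack * Lg_h / (Lg_h @ Lg_h + 1e-9)  # project onto the constraint boundary

# toy double integrator; the barrier h(x) = 1 - position - velocity must stay nonnegative
f = lambda x: np.array([x[1], 0.0])
g = lambda x: np.array([[0.0], [1.0]])
h = lambda x: 1.0 - x[0] - x[1]
grad_h = lambda x: np.array([-1.0, -1.0])
u_safe = cbf_filter(np.array([5.0]), np.array([0.5, 0.3]), f, g, h, grad_h)
```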
arXiv Detail & Related papers (2023-10-08T00:32:59Z)
- Runtime Stealthy Perception Attacks against DNN-based Adaptive Cruise Control Systems [8.561553195784017]
This paper evaluates the security of deep neural network (DNN)-based Adaptive Cruise Control (ACC) systems under runtime perception attacks.
We present a context-aware strategy for the selection of the most critical times for triggering the attacks.
We evaluate the effectiveness of the proposed attack using an actual vehicle, a publicly available driving dataset, and a realistic simulation platform.
arXiv Detail & Related papers (2023-07-18T03:12:03Z)
- Safety Correction from Baseline: Towards the Risk-aware Policy in Robotics via Dual-agent Reinforcement Learning [64.11013095004786]
We propose a dual-agent safe reinforcement learning strategy consisting of a baseline and a safe agent.
Such a decoupled framework enables high flexibility, data efficiency and risk-awareness for RL-based control.
The proposed method outperforms the state-of-the-art safe RL algorithms on difficult robot locomotion and manipulation tasks.
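A minimal sketch of the dual-agent idea (hypothetical names): the baseline agent proposes a task action and the safe agent adds a bounded correction, keeping reward maximization and risk handling decoupled; the paper's actual architecture and training objectives differ.

```python
import numpy as np

def dual_agent_action(obs, baseline_agent, safe_agent, max_correction=0.2):
    """Baseline agent maximizes task reward; safe agent applies a small, bounded
    correction trained against a risk/cost signal (decoupled responsibilities)."""
    a_task = baseline_agent(obs)
    delta = np.clip(safe_agent(obs, a_task), -max_correction, max_correction)
    return np.clip(a_task + delta, -1.0, 1.0)

baseline_agent = lambda obs: np.tanh(obs[:6])     # placeholder task policy
safe_agent = lambda obs, a: -0.5 * a              # placeholder: damp aggressive actions
a = dual_agent_action(np.linspace(-1, 1, 12), baseline_agent, safe_agent)
```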
arXiv Detail & Related papers (2022-12-14T03:11:25Z)
- OSCAR: Data-Driven Operational Space Control for Adaptive and Robust Robot Manipulation [50.59541802645156]
Operational Space Control (OSC) has been used as an effective task-space controller for manipulation.
We propose OSC for Adaptation and Robustness (OSCAR), a data-driven variant of OSC that compensates for modeling errors.
We evaluate our method on a variety of simulated manipulation problems, and find substantial improvements over an array of controller baselines.
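For context, a bare-bones textbook OSC law is sketched below in numpy; OSCAR's data-driven part, learning corrections to the dynamics quantities used here, is only indicated in comments, and the names and gains are illustrative.

```python
import numpy as np

def osc_torques(J, M, x_err, x_dot, kp=100.0, kd=20.0, bias=None):
    """Classical operational space control: turn a task-space PD target into joint torques.
    OSCAR learns data-driven corrections to the dynamics quantities used below
    (e.g., the mass matrix M and hence the task-space inertia) to compensate modeling error."""
    M_inv = np.linalg.inv(M)
    lam = np.linalg.inv(J @ M_inv @ J.T + 1e-6 * np.eye(J.shape[0]))  # task-space inertia
    xdd_des = kp * x_err - kd * x_dot                                 # PD acceleration target
    tau = J.T @ (lam @ xdd_des)
    if bias is not None:                                              # gravity/Coriolis compensation
        tau = tau + bias
    return tau

# toy 2-DoF arm with a 2-D task space
tau = osc_torques(J=np.array([[1.0, 0.5], [0.0, 1.0]]),
                  M=np.diag([2.0, 1.0]),
                  x_err=np.array([0.05, -0.02]),
                  x_dot=np.zeros(2))
```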
arXiv Detail & Related papers (2021-10-02T01:21:38Z)
- Learning to be Safe: Deep RL with a Safety Critic [72.00568333130391]
A natural first approach toward safe RL is to manually specify constraints on the policy's behavior.
We propose to learn how to be safe in one set of tasks and environments, and then use that learned intuition to constrain future behaviors.
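One standard way to turn such "learned intuition" into a constraint is rejection sampling against a pretrained safety critic, sketched below with hypothetical names; the paper's exact objective and thresholding may differ.

```python
import numpy as np

def safe_action(obs, policy_sample, q_risk, eps=0.1, n_samples=32, rng=None):
    """Sample candidate actions from the task policy and keep the first one whose
    estimated failure risk (from a pretrained safety critic) is below eps;
    if none qualifies, return the least risky candidate."""
    rng = rng or np.random.default_rng(0)
    candidates = [policy_sample(obs, rng) for _ in range(n_samples)]
    risks = np.array([q_risk(obs, a) for a in candidates])
    safe_idx = np.flatnonzero(risks < eps)
    best = safe_idx[0] if safe_idx.size else int(np.argmin(risks))
    return candidates[best]

policy_sample = lambda obs, rng: rng.uniform(-1, 1, size=4)      # placeholder stochastic policy
q_risk = lambda obs, a: float(np.clip(np.abs(a).max(), 0, 1))    # placeholder safety critic
a = safe_action(np.zeros(8), policy_sample, q_risk)
```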
arXiv Detail & Related papers (2020-10-27T20:53:20Z)
- Cautious Adaptation For Reinforcement Learning in Safety-Critical Settings [129.80279257258098]
Reinforcement learning (RL) in real-world safety-critical target settings like urban driving is hazardous.
We propose a "safety-critical adaptation" task setting: an agent first trains in non-safety-critical "source" environments.
We propose a solution approach, CARL, that builds on the intuition that prior experience in diverse environments equips an agent to estimate risk.
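A compact way to picture risk estimation from diverse prior environments is pessimistic planning over an ensemble of dynamics models fit in those environments; the sketch below is illustrative (invented names), not CARL's exact procedure.

```python
import numpy as np

def pessimistic_plan(x0, candidate_plans, models, cost, risk_weight=1.0):
    """Score each candidate action sequence by mean + risk_weight * std of its cost
    across an ensemble of dynamics models, and return the most cautious plan."""
    scores = []
    for plan in candidate_plans:
        costs = []
        for model in models:                 # each model was fit in a different source env
            x, c = x0.copy(), 0.0
            for u in plan:
                x = model(x, u)
                c += cost(x, u)
            costs.append(c)
        costs = np.array(costs)
        scores.append(costs.mean() + risk_weight * costs.std())
    return candidate_plans[int(np.argmin(scores))]

models = [lambda x, u, k=k: x + (1.0 + 0.1 * k) * u for k in range(5)]   # toy ensemble
plans = [np.full((3, 1), 0.2), np.full((3, 1), 1.0)]
cost = lambda x, u: float((x ** 2).sum())
best = pessimistic_plan(np.array([1.0]), plans, models, cost)
```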
arXiv Detail & Related papers (2020-08-15T01:40:59Z)
- Model-Based Meta-Reinforcement Learning for Flight with Suspended Payloads [69.21503033239985]
Transporting suspended payloads is challenging for autonomous aerial vehicles.
We propose a meta-learning approach that "learns how to learn" models of altered dynamics within seconds of post-connection flight data.
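Stripped to its core, the adaptation step fits a corrected dynamics model from a few seconds of post-connection data; below is a generic least-squares residual-model version (invented names, far simpler than the paper's meta-learned networks).

```python
import numpy as np

def adapt_residual_model(states, actions, next_states, nominal_model, reg=1e-3):
    """Fit a linear residual on top of a nominal dynamics model from a small batch of
    recent transitions (e.g., the first seconds of flight after attaching a payload)."""
    features = np.hstack([states, actions])                                        # (N, n+m)
    predicted = np.array([nominal_model(s, a) for s, a in zip(states, actions)])
    W = np.linalg.solve(features.T @ features + reg * np.eye(features.shape[1]),
                        features.T @ (next_states - predicted))                    # ridge regression
    return lambda s, a: nominal_model(s, a) + np.hstack([s, a]) @ W

nominal_model = lambda s, a: s + 0.1 * a          # toy nominal dynamics
states = np.random.randn(20, 3)
actions = np.random.randn(20, 3)
next_states = states + 0.2 * actions              # payload changes the control effectiveness
adapted = adapt_residual_model(states, actions, next_states, nominal_model)
```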
arXiv Detail & Related papers (2020-04-23T17:43:56Z)