SATA: Safe and Adaptive Torque-Based Locomotion Policies Inspired by Animal Learning
- URL: http://arxiv.org/abs/2502.12674v1
- Date: Tue, 18 Feb 2025 09:25:37 GMT
- Title: SATA: Safe and Adaptive Torque-Based Locomotion Policies Inspired by Animal Learning
- Authors: Peizhuo Li, Hongyi Li, Ge Sun, Jin Cheng, Xinrong Yang, Guillaume Bellegarda, Milad Shafiee, Yuhong Cao, Auke Ijspeert, Guillaume Sartoretti
- Abstract summary: SATA is a bio-inspired framework that mimics key biomechanical principles and adaptive learning mechanisms observed in animal locomotion.
Our approach effectively addresses the inherent challenges of learning torque-based policies by significantly improving early-stage exploration.
Our experimental results indicate that SATA demonstrates remarkable compliance and safety, even in challenging environments.
- Score: 10.138425472807368
- Abstract: Despite recent advances in learning-based controllers for legged robots, deployments in human-centric environments remain limited by safety concerns. Most of these approaches use position-based control, where policies output target joint angles that must be processed by a low-level controller (e.g., PD or impedance controllers) to compute joint torques. Although impressive results have been achieved in controlled real-world scenarios, these methods often struggle with compliance and adaptability when encountering environments or disturbances unseen during training, potentially resulting in extreme or unsafe behaviors. Inspired by how animals achieve smooth and adaptive movements by controlling muscle extension and contraction, torque-based policies offer a promising alternative by enabling precise and direct control of the actuators in torque space. In principle, this approach facilitates more effective interactions with the environment, resulting in safer and more adaptable behaviors. However, challenges such as a highly nonlinear state space and inefficient exploration during training have hindered their broader adoption. To address these limitations, we propose SATA, a bio-inspired framework that mimics key biomechanical principles and adaptive learning mechanisms observed in animal locomotion. Our approach effectively addresses the inherent challenges of learning torque-based policies by significantly improving early-stage exploration, leading to high-performance final policies. Remarkably, our method achieves zero-shot sim-to-real transfer. Our experimental results indicate that SATA demonstrates remarkable compliance and safety, even in challenging environments such as soft/slippery terrain or narrow passages, and under significant external disturbances, highlighting its potential for practical deployments in human-centric and safety-critical scenarios.
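The contrast between the two action spaces the abstract describes can be sketched in a few lines. The snippet below is an illustrative sketch, not the SATA implementation: the gains `KP` and `KD`, the `TORQUE_LIMIT`, and both function names are hypothetical values chosen for the example.

```python
import numpy as np

# Assumed, illustrative constants (not from the paper).
KP = 30.0            # proportional gain
KD = 0.8             # derivative gain
TORQUE_LIMIT = 33.5  # per-joint actuator limit in Nm

def pd_torque(q_target, q, q_dot):
    """Position-based pipeline: the policy outputs target joint angles
    q_target, and a low-level PD controller converts the tracking error
    into joint torques."""
    tau = KP * (q_target - q) - KD * q_dot  # damp toward zero joint velocity
    return np.clip(tau, -TORQUE_LIMIT, TORQUE_LIMIT)

def direct_torque(policy_action):
    """Torque-based pipeline: the policy acts directly in torque space,
    so compliance comes from the learned policy rather than fixed PD gains."""
    return np.clip(policy_action, -TORQUE_LIMIT, TORQUE_LIMIT)

# Example: a large tracking error under the PD scheme saturates the actuator.
q, q_dot = np.zeros(12), np.zeros(12)
print(pd_torque(q + 2.0, q, q_dot))  # saturates at the 33.5 Nm limit
```

With stiff PD gains, even a moderate tracking error drives the actuator straight to its torque limit, which is one way the position-based pipeline can produce the extreme behaviors the abstract warns about; a torque-space policy can instead learn to modulate its output near contacts.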
Related papers
- COMBO-Grasp: Learning Constraint-Based Manipulation for Bimanual Occluded Grasping [56.907940167333656]
  In occluded robot grasping, the desired grasp poses are kinematically infeasible due to environmental constraints such as surface collisions.
  Traditional robot manipulation approaches struggle with the complexity of the non-prehensile or bimanual strategies commonly used by humans.
  We introduce Constraint-based Manipulation for Bimanual Occluded Grasping (COMBO-Grasp), a learning-based approach that leverages two coordinated policies.
  arXiv Detail & Related papers (2025-02-12T01:31:01Z)
- Bridging Adaptivity and Safety: Learning Agile Collision-Free Locomotion Across Varied Physics [10.408245303948993]
  BAS (Bridging Adaptivity and Safety) is designed to provide adaptive safety even in dynamic environments with uncertainties.
  We show that BAS achieves 50% better safety than baselines in dynamic environments while maintaining a higher speed on average.
  As a result, BAS achieves a 19.8% increase in speed and a 2.36-times lower collision rate than ABS in the real world.
  arXiv Detail & Related papers (2025-01-08T04:54:28Z)
- Safe Policy Exploration Improvement via Subgoals [44.07721205323709]
  Reinforcement learning is a widely used approach to autonomous navigation, showing potential across various tasks and robotic setups.
  One of the main reasons for poor performance in such setups is that the need to respect safety constraints degrades the exploration capabilities of an RL agent.
  We introduce a novel learnable algorithm based on decomposing the initial problem into smaller sub-problems via intermediate goals.
  arXiv Detail & Related papers (2024-08-25T16:12:49Z)
- RACER: Epistemic Risk-Sensitive RL Enables Fast Driving with Fewer Crashes [57.319845580050924]
  We propose a reinforcement learning framework that combines risk-sensitive control with an adaptive action space curriculum.
  We show that our algorithm is capable of learning high-speed policies for a real-world off-road driving task.
  arXiv Detail & Related papers (2024-05-07T23:32:36Z)
- Integrating DeepRL with Robust Low-Level Control in Robotic Manipulators for Non-Repetitive Reaching Tasks [0.24578723416255746]
  In robotics, contemporary strategies are learning-based, characterized by a complex black-box nature and a lack of interpretability.
  We propose integrating a collision-free trajectory planner based on deep reinforcement learning (DRL) with a novel auto-tuning low-level control strategy.
  arXiv Detail & Related papers (2024-02-04T15:54:03Z)
- Safe Deep Policy Adaptation [7.2747306035142225]
  Policy adaptation based on reinforcement learning (RL) offers versatility and generalizability but presents safety and robustness challenges.
  We propose SafeDPA, a novel RL and control framework that simultaneously tackles the problems of policy adaptation and safe reinforcement learning.
  We provide theoretical safety guarantees for SafeDPA and show its robustness against learning errors and extra perturbations.
  arXiv Detail & Related papers (2023-10-08T00:32:59Z)
- Learning Variable Impedance Control for Aerial Sliding on Uneven Heterogeneous Surfaces by Proprioceptive and Tactile Sensing [42.27572349747162]
  We present a learning-based adaptive control strategy for aerial sliding tasks.
  The proposed controller structure combines data-driven and model-based control methods.
  Compared to fine-tuned state-of-the-art interaction control methods, we achieve reduced tracking error and improved disturbance rejection.
  arXiv Detail & Related papers (2022-06-28T16:28:59Z)
- Learning Robust Policy against Disturbance in Transition Dynamics via State-Conservative Policy Optimization [63.75188254377202]
  Deep reinforcement learning algorithms can perform poorly in real-world tasks due to the discrepancy between source and target environments.
  We propose a novel model-free actor-critic algorithm to learn robust policies without modeling the disturbance in advance.
  Experiments on several robot control tasks demonstrate that SCPO learns policies that are robust to disturbances in the transition dynamics.
  arXiv Detail & Related papers (2021-12-20T13:13:05Z)
- Cautious Adaptation For Reinforcement Learning in Safety-Critical Settings [129.80279257258098]
  Reinforcement learning (RL) in real-world safety-critical target settings like urban driving is hazardous.
  We propose a "safety-critical adaptation" task setting: an agent first trains in non-safety-critical "source" environments.
  We propose a solution approach, CARL, that builds on the intuition that prior experience in diverse environments equips an agent to estimate risk.
  arXiv Detail & Related papers (2020-08-15T01:40:59Z)
- Learning Compliance Adaptation in Contact-Rich Manipulation [81.40695846555955]
  We propose a novel approach for learning predictive models of the force profiles required for contact-rich tasks.
  The approach combines anomaly detection based on Bidirectional Gated Recurrent Units (Bi-GRU) with an adaptive force/impedance controller.
  arXiv Detail & Related papers (2020-05-01T05:23:34Z)
- Guided Constrained Policy Optimization for Dynamic Quadrupedal Robot Locomotion [78.46388769788405]
  We introduce guided constrained policy optimization (GCPO), an RL framework based upon our implementation of constrained proximal policy optimization (CPPO).
  We show that guided constrained RL offers faster convergence close to the desired optimum, resulting in optimal yet physically feasible robotic control behavior without the need for precise reward function tuning.
  arXiv Detail & Related papers (2020-02-22T10:15:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.