Dynamic Obstacle Avoidance with Bounded Rationality Adversarial Reinforcement Learning
- URL: http://arxiv.org/abs/2503.11467v1
- Date: Fri, 14 Mar 2025 14:54:02 GMT
- Title: Dynamic Obstacle Avoidance with Bounded Rationality Adversarial Reinforcement Learning
- Authors: Jose-Luis Holgado-Alvarez, Aryaman Reddi, Carlo D'Eramo,
- Abstract summary: We propose a novel way to endow navigation policies with robustness by a training process that models obstacles as adversarial agents.<n>We call this method versa policies via Quantal response Adrial Reinforcement Learning (Hi-QARL)
- Score: 5.760394464143113
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Reinforcement Learning (RL) has proven largely effective in obtaining stable locomotion gaits for legged robots. However, designing control algorithms which can robustly navigate unseen environments with obstacles remains an ongoing problem within quadruped locomotion. To tackle this, it is convenient to solve navigation tasks by means of a hierarchical approach with a low-level locomotion policy and a high-level navigation policy. Crucially, the high-level policy needs to be robust to dynamic obstacles along the path of the agent. In this work, we propose a novel way to endow navigation policies with robustness by a training process that models obstacles as adversarial agents, following the adversarial RL paradigm. Importantly, to improve the reliability of the training process, we bound the rationality of the adversarial agent resorting to quantal response equilibria, and place a curriculum over its rationality. We called this method Hierarchical policies via Quantal response Adversarial Reinforcement Learning (Hi-QARL). We demonstrate the robustness of our method by benchmarking it in unseen randomized mazes with multiple obstacles. To prove its applicability in real scenarios, our method is applied on a Unitree GO1 robot in simulation.
Related papers
- Multi-Objective Algorithms for Learning Open-Ended Robotic Problems [1.0124625066746598]
Quadrupedal locomotion is a complex, open-ended problem vital to expanding autonomous vehicle reach.
Traditional reinforcement learning approaches often fall short due to training instability and sample inefficiency.
We propose a novel method leveraging multi-objective evolutionary algorithms as an automatic curriculum learning mechanism.
arXiv Detail & Related papers (2024-11-11T16:26:42Z) - Guided Reinforcement Learning for Robust Multi-Contact Loco-Manipulation [12.377289165111028]
Reinforcement learning (RL) often necessitates a meticulous Markov Decision Process (MDP) design tailored to each task.
This work proposes a systematic approach to behavior synthesis and control for multi-contact loco-manipulation tasks.
We define a task-independent MDP to train RL policies using only a single demonstration per task generated from a model-based trajectory.
arXiv Detail & Related papers (2024-10-17T17:46:27Z) - Aquatic Navigation: A Challenging Benchmark for Deep Reinforcement Learning [53.3760591018817]
We propose a new benchmarking environment for aquatic navigation using recent advances in the integration between game engines and Deep Reinforcement Learning.
Specifically, we focus on PPO, one of the most widely accepted algorithms, and we propose advanced training techniques.
Our empirical evaluation shows that a well-designed combination of these ingredients can achieve promising results.
arXiv Detail & Related papers (2024-05-30T23:20:23Z) - RACER: Epistemic Risk-Sensitive RL Enables Fast Driving with Fewer Crashes [57.319845580050924]
We propose a reinforcement learning framework that combines risk-sensitive control with an adaptive action space curriculum.
We show that our algorithm is capable of learning high-speed policies for a real-world off-road driving task.
arXiv Detail & Related papers (2024-05-07T23:32:36Z) - Distilling Reinforcement Learning Policies for Interpretable Robot Locomotion: Gradient Boosting Machines and Symbolic Regression [53.33734159983431]
This paper introduces a novel approach to distill neural RL policies into more interpretable forms.
We train expert neural network policies using RL and distill them into (i) GBMs, (ii) EBMs, and (iii) symbolic policies.
arXiv Detail & Related papers (2024-03-21T11:54:45Z) - Nonprehensile Planar Manipulation through Reinforcement Learning with
Multimodal Categorical Exploration [8.343657309038285]
Reinforcement Learning is a powerful framework for developing such robot controllers.
We propose a multimodal exploration approach through categorical distributions, which enables us to train planar pushing RL policies.
We show that the learned policies are robust to external disturbances and observation noise, and scale to tasks with multiple pushers.
arXiv Detail & Related papers (2023-08-04T16:55:00Z) - Enhanced method for reinforcement learning based dynamic obstacle
avoidance by assessment of collision risk [0.0]
This paper proposes a general training environment where we gain control over the difficulty of the obstacle avoidance task.
We found that shifting the training towards a greater task difficulty can massively increase the final performance.
arXiv Detail & Related papers (2022-12-08T07:46:42Z) - Leveraging Sequentiality in Reinforcement Learning from a Single
Demonstration [68.94506047556412]
We propose to leverage a sequential bias to learn control policies for complex robotic tasks using a single demonstration.
We show that DCIL-II can solve with unprecedented sample efficiency some challenging simulated tasks such as humanoid locomotion and stand-up.
arXiv Detail & Related papers (2022-11-09T10:28:40Z) - Learning to Walk by Steering: Perceptive Quadrupedal Locomotion in
Dynamic Environments [25.366480092589022]
A quadrupedal robot must exhibit robust and agile walking behaviors in response to environmental clutter and moving obstacles.
We present a hierarchical learning framework, named PRELUDE, which decomposes the problem of perceptive locomotion into high-level decision-making.
We demonstrate the effectiveness of our approach in simulation and with hardware experiments.
arXiv Detail & Related papers (2022-09-19T17:55:07Z) - XAI-N: Sensor-based Robot Navigation using Expert Policies and Decision
Trees [55.9643422180256]
We present a novel sensor-based learning navigation algorithm to compute a collision-free trajectory for a robot in dense and dynamic environments.
Our approach uses deep reinforcement learning-based expert policy that is trained using a sim2real paradigm.
We highlight the benefits of our algorithm in simulated environments and navigating a Clearpath Jackal robot among moving pedestrians.
arXiv Detail & Related papers (2021-04-22T01:33:10Z) - Reinforcement Learning for Robust Parameterized Locomotion Control of
Bipedal Robots [121.42930679076574]
We present a model-free reinforcement learning framework for training robust locomotion policies in simulation.
domain randomization is used to encourage the policies to learn behaviors that are robust across variations in system dynamics.
We demonstrate this on versatile walking behaviors such as tracking a target walking velocity, walking height, and turning yaw.
arXiv Detail & Related papers (2021-03-26T07:14:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.