Reinforcement Learning with Evolutionary Trajectory Generator: A General
Approach for Quadrupedal Locomotion
- URL: http://arxiv.org/abs/2109.06409v2
- Date: Thu, 16 Sep 2021 12:42:50 GMT
- Title: Reinforcement Learning with Evolutionary Trajectory Generator: A General
Approach for Quadrupedal Locomotion
- Authors: Haojie Shi, Bo Zhou, Hongsheng Zeng, Fan Wang, Yueqiang Dong,
Jiangyong Li, Kang Wang, Hao Tian, Max Q.-H. Meng
- Abstract summary: We propose a novel RL-based approach that contains an evolutionary foot trajectory generator.
The generator continually optimizes the shape of the output trajectory for the given task, providing diversified motion priors to guide the policy learning.
We deploy the controller learned in the simulation on a 12-DoF quadrupedal robot, and it can successfully traverse challenging scenarios with efficient gaits.
- Score: 29.853927354893656
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recently reinforcement learning (RL) has emerged as a promising approach for
quadrupedal locomotion, which can save the manual effort in conventional
approaches such as designing skill-specific controllers. However, due to the
complex nonlinear dynamics in quadrupedal robots and reward sparsity, it is
still difficult for RL to learn effective gaits from scratch, especially in
challenging tasks such as walking over the balance beam. To alleviate such
difficulty, we propose a novel RL-based approach that contains an evolutionary
foot trajectory generator. Unlike prior methods that use a fixed trajectory
generator, the generator continually optimizes the shape of the output
trajectory for the given task, providing diversified motion priors to guide the
policy learning. The policy is trained with reinforcement learning to output
residual control signals that fit different gaits. We then optimize the
trajectory generator and policy network alternately to stabilize the training
and share the exploratory data to improve sample efficiency. As a result, our
approach can solve a range of challenging tasks in simulation by learning from
scratch, including walking on a balance beam and crawling through the cave. To
further verify the effectiveness of our approach, we deploy the controller
learned in the simulation on a 12-DoF quadrupedal robot, and it can
successfully traverse challenging scenarios with efficient gaits.
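
The alternating scheme described in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the half-sine foot trajectory, the two-weight "policy", the target motion, and every function name below are hypothetical, and simple elitist random search stands in for both the evolutionary update and the reinforcement learning update. The sharing of exploratory data between the two optimizers is not modeled.

```python
import math
import random

PHASES = [i / 20 for i in range(20)]  # sample points over one gait cycle


def trajectory(phase, tg_params):
    """Hypothetical foot-height prior: half-sine swing, flat stance."""
    step_height, swing_ratio = tg_params
    if phase < swing_ratio:
        return step_height * math.sin(math.pi * phase / swing_ratio)
    return 0.0


def residual(phase, weights):
    """Stand-in for the policy network: a linear map on two phase features."""
    return (weights[0] * math.sin(2 * math.pi * phase)
            + weights[1] * math.cos(2 * math.pi * phase))


def episode_return(tg_params, weights):
    """Toy reward (assumption): negative squared error to an arbitrary target motion."""
    total = 0.0
    for p in PHASES:
        target = 0.1 * math.sin(math.pi * p / 0.5) if p < 0.5 else 0.0
        command = trajectory(p, tg_params) + residual(p, weights)
        total -= (command - target) ** 2
    return total


def perturb(values, sigma, rng):
    """Gaussian mutation of a parameter vector."""
    return [v + rng.gauss(0.0, sigma) for v in values]


def clamp_tg(tg_params):
    """Keep the trajectory parameters in a valid range."""
    step_height, swing_ratio = tg_params
    return [max(0.0, step_height), min(0.9, max(0.1, swing_ratio))]


def train(rounds=30, population=8, seed=0):
    rng = random.Random(seed)
    tg_params, weights = [0.02, 0.3], [0.0, 0.0]
    history = [episode_return(tg_params, weights)]
    for _ in range(rounds):
        # Step 1: evolve the trajectory generator while the policy is frozen.
        candidates = [tg_params] + [
            clamp_tg(perturb(tg_params, 0.02, rng)) for _ in range(population)
        ]
        tg_params = max(candidates, key=lambda c: episode_return(c, weights))
        # Step 2: improve the residual policy while the generator is frozen.
        candidates = [weights] + [
            perturb(weights, 0.01, rng) for _ in range(population)
        ]
        weights = max(candidates, key=lambda c: episode_return(tg_params, c))
        history.append(episode_return(tg_params, weights))
    return tg_params, weights, history
```

Calling `tg_params, weights, history = train()` returns the evolved trajectory parameters, the residual weights, and the return after each round. Because each step keeps the incumbent among its candidates, `history` never decreases in this sketch; a real policy-gradient update would not give that guarantee.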
Related papers
- Multi-Objective Algorithms for Learning Open-Ended Robotic Problems [1.0124625066746598]
Quadrupedal locomotion is a complex, open-ended problem vital to expanding autonomous vehicle reach.
Traditional reinforcement learning approaches often fall short due to training instability and sample inefficiency.
We propose a novel method leveraging multi-objective evolutionary algorithms as an automatic curriculum learning mechanism.
arXiv Detail & Related papers (2024-11-11T16:26:42Z)
- Grow Your Limits: Continuous Improvement with Real-World RL for Robotic Locomotion [66.69666636971922]
We present APRL, a policy regularization framework that modulates the robot's exploration over the course of training.
APRL enables a quadrupedal robot to efficiently learn to walk entirely in the real world within minutes.
arXiv Detail & Related papers (2023-10-26T17:51:46Z)
- DTC: Deep Tracking Control [16.2850135844455]
We propose a hybrid control architecture that combines the advantages of both worlds to achieve greater robustness, foot-placement accuracy, and terrain generalization.
A deep neural network policy is trained in simulation, aiming to track the optimized footholds.
We demonstrate superior robustness in the presence of slippery or deformable ground when compared to model-based counterparts.
arXiv Detail & Related papers (2023-09-27T07:57:37Z)
- Combining model-predictive control and predictive reinforcement learning for stable quadrupedal robot locomotion [0.0]
We study how this can be achieved by a combination of model-predictive and predictive reinforcement learning controllers.
In this work, we combine both control methods to address the quadrupedal robot stable gait generation problem.
arXiv Detail & Related papers (2023-07-15T09:22:37Z)
- Decision S4: Efficient Sequence-Based RL via State Spaces Layers [87.3063565438089]
We present an off-policy training procedure that works with trajectories, while still maintaining the training efficiency of the S4 model.
We also present an on-policy training procedure that is trained in a recurrent manner, benefits from long-range dependencies, and is based on a novel stable actor-critic mechanism.
arXiv Detail & Related papers (2023-06-08T13:03:53Z)
- Continuous Trajectory Generation Based on Two-Stage GAN [50.55181727145379]
We propose a novel two-stage generative adversarial framework to generate the continuous trajectory on the road network.
Specifically, we build the generator under the human mobility hypothesis of the A* algorithm to learn the human mobility behavior.
For the discriminator, we combine the sequential reward with the mobility yaw reward to enhance the effectiveness of the generator.
arXiv Detail & Related papers (2023-01-16T09:54:02Z)
- Learning to Exploit Elastic Actuators for Quadruped Locomotion [7.9585932082270014]
Spring-based actuators in legged locomotion provide energy-efficiency and improved performance, but increase the difficulty of controller design.
We propose to learn model-free controllers directly on the real robot.
We evaluate the proposed approach on the DLR elastic quadruped bert.
arXiv Detail & Related papers (2022-09-15T09:43:17Z)
- Training and Evaluation of Deep Policies using Reinforcement Learning and Generative Models [67.78935378952146]
GenRL is a framework for solving sequential decision-making problems.
It exploits the combination of reinforcement learning and latent variable generative models.
We experimentally determine the characteristics of generative models that have most influence on the performance of the final policy training.
arXiv Detail & Related papers (2022-04-18T22:02:32Z)
- RLOC: Terrain-Aware Legged Locomotion using Reinforcement Learning and Optimal Control [6.669503016190925]
We present a unified model-based and data-driven approach for quadrupedal planning and control.
We map sensory information and desired base velocity commands into footstep plans using a reinforcement learning policy.
We train and evaluate our framework on a complex quadrupedal system, ANYmal B, and demonstrate transferability to a larger and heavier robot, ANYmal C, without requiring retraining.
arXiv Detail & Related papers (2020-12-05T18:30:23Z)
- Continuous Transition: Improving Sample Efficiency for Continuous Control Problems via MixUp [119.69304125647785]
This paper introduces a concise yet powerful method to construct Continuous Transition.
Specifically, we propose to synthesize new transitions for training by linearly interpolating the consecutive transitions.
To keep the constructed transitions authentic, we also develop a discriminator to guide the construction process automatically.
arXiv Detail & Related papers (2020-11-30T01:20:23Z)
- ReLMoGen: Leveraging Motion Generation in Reinforcement Learning for Mobile Manipulation [99.2543521972137]
ReLMoGen is a framework that combines a learned policy to predict subgoals and a motion generator to plan and execute the motion needed to reach these subgoals.
Our method is benchmarked on a diverse set of seven robotics tasks in photo-realistic simulation environments.
ReLMoGen shows outstanding transferability between different motion generators at test time, indicating a great potential to transfer to real robots.
arXiv Detail & Related papers (2020-08-18T08:05:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.