Related papers: Reinforcement Learning with Evolutionary Trajectory Generator: A General Approach for Quadrupedal Locomotion

URL: http://arxiv.org/abs/2109.06409v2
Date: Thu, 16 Sep 2021 12:42:50 GMT
Title: Reinforcement Learning with Evolutionary Trajectory Generator: A General Approach for Quadrupedal Locomotion
Authors: Haojie Shi, Bo Zhou, Hongsheng Zeng, Fan Wang, Yueqiang Dong, Jiangyong Li, Kang Wang, Hao Tian, Max Q.-H. Meng
Abstract summary: We propose a novel RL-based approach that contains an evolutionary foot trajectory generator. The generator continually optimizes the shape of the output trajectory for the given task, providing diversified motion priors to guide the policy learning. We deploy the controller learned in the simulation on a 12-DoF quadrupedal robot, and it can successfully traverse challenging scenarios with efficient gaits.
Score: 29.853927354893656
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Recently, reinforcement learning (RL) has emerged as a promising approach for quadrupedal locomotion, which can save the manual effort in conventional approaches such as designing skill-specific controllers. However, due to the complex nonlinear dynamics in quadrupedal robots and reward sparsity, it is still difficult for RL to learn effective gaits from scratch, especially in challenging tasks such as walking over the balance beam. To alleviate such difficulty, we propose a novel RL-based approach that contains an evolutionary foot trajectory generator. Unlike prior methods that use a fixed trajectory generator, the generator continually optimizes the shape of the output trajectory for the given task, providing diversified motion priors to guide the policy learning. The policy is trained with reinforcement learning to output residual control signals that fit different gaits. We then optimize the trajectory generator and policy network alternately to stabilize the training and share the exploratory data to improve sample efficiency. As a result, our approach can solve a range of challenging tasks in simulation by learning from scratch, including walking on a balance beam and crawling through the cave. To further verify the effectiveness of our approach, we deploy the controller learned in the simulation on a 12-DoF quadrupedal robot, and it can successfully traverse challenging scenarios with efficient gaits.
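The alternating scheme described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the one-parameter sine-arc generator, the scalar "policy" gain, the tracking reward, and every function name below are hypothetical, and random-search hill climbing stands in for the reinforcement learning update. The sharing of exploratory data between the two optimizers is omitted. The sketch only shows the control structure: the action is the generator's prior plus a residual, and the generator and the policy are improved in turn while the other is held fixed.

```python
import math
import random


def trajectory(tg_params, phase):
    """Toy foot-height prior: a sine arc scaled by one learnable amplitude."""
    (amplitude,) = tg_params
    return amplitude * math.sin(math.pi * phase)


def rollout(tg_params, residual_gain, target_amplitude=0.5, steps=20):
    """Score one episode as the negative squared error to a hypothetical target arc."""
    reward = 0.0
    for i in range(steps):
        phase = i / (steps - 1)
        prior = trajectory(tg_params, phase)
        # Stand-in "policy": a single gain that produces the residual signal.
        residual = residual_gain * math.sin(math.pi * phase)
        action = prior + residual  # trajectory prior plus residual control
        target = target_amplitude * math.sin(math.pi * phase)
        reward -= (action - target) ** 2
    return reward


def evolve_generator(tg_params, residual_gain, rng, pop=16, sigma=0.1):
    """One evolutionary step: perturb the parameters and keep the best candidate."""
    candidates = [tg_params] + [
        [p + rng.gauss(0.0, sigma) for p in tg_params] for _ in range(pop)
    ]
    return max(candidates, key=lambda c: rollout(c, residual_gain))


def improve_policy(tg_params, residual_gain, rng, tries=16, sigma=0.05):
    """Assumed stand-in for the RL update: random-search hill climbing."""
    candidates = [residual_gain] + [
        residual_gain + rng.gauss(0.0, sigma) for _ in range(tries)
    ]
    return max(candidates, key=lambda g: rollout(tg_params, g))


def train(iterations=30, seed=0):
    """Alternate generator and policy updates, recording the reward after each round."""
    rng = random.Random(seed)
    tg_params, residual_gain = [0.0], 0.0
    history = [rollout(tg_params, residual_gain)]
    for _ in range(iterations):
        tg_params = evolve_generator(tg_params, residual_gain, rng)    # policy frozen
        residual_gain = improve_policy(tg_params, residual_gain, rng)  # generator frozen
        history.append(rollout(tg_params, residual_gain))
    return tg_params, residual_gain, history


if __name__ == "__main__":
    params, gain, history = train()
    print(f"amplitude={params[0]:.3f} gain={gain:.3f} reward={history[-1]:.4f}")
```

Because each update keeps the current solution among its candidates, the recorded reward never decreases from one round to the next in this toy setting. A real controller would replace the hill-climbing step with an actual RL algorithm and the sine arc with a multi-parameter foot trajectory.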

Related papers

Action Flow Matching for Continual Robot Learning [57.698553219660376]
Continual learning in robotics seeks systems that can constantly adapt to changing environments and tasks. We introduce a generative framework leveraging flow matching for online robot dynamics model alignment. We find that by transforming the actions themselves rather than exploring with a misaligned model, the robot collects informative data more efficiently.
arXiv Detail & Related papers (2025-04-25T16:26:15Z)
Training Directional Locomotion for Quadrupedal Low-Cost Robotic Systems via Deep Reinforcement Learning [4.669957449088593]
We present Deep Reinforcement Learning training of directional locomotion for low-cost quadrupedal robots in the real world. We exploit randomization of heading that the robot must follow to foster exploration of action-state transitions. Changing the heading in episode resets to current yaw plus a random value drawn from a normal distribution yields policies able to follow complex trajectories.
arXiv Detail & Related papers (2025-03-14T03:53:01Z)
Multi-Objective Algorithms for Learning Open-Ended Robotic Problems [1.0124625066746598]
Quadrupedal locomotion is a complex, open-ended problem vital to expanding autonomous vehicle reach. Traditional reinforcement learning approaches often fall short due to training instability and sample inefficiency. We propose a novel method leveraging multi-objective evolutionary algorithms as an automatic curriculum learning mechanism.
arXiv Detail & Related papers (2024-11-11T16:26:42Z)
Grow Your Limits: Continuous Improvement with Real-World RL for Robotic Locomotion [66.69666636971922]
We present APRL, a policy regularization framework that modulates the robot's exploration over the course of training. APRL enables a quadrupedal robot to efficiently learn to walk entirely in the real world within minutes.
arXiv Detail & Related papers (2023-10-26T17:51:46Z)
DTC: Deep Tracking Control [16.2850135844455]
We propose a hybrid control architecture that combines the advantages of both worlds to achieve greater robustness, foot-placement accuracy, and terrain generalization. A deep neural network policy is trained in simulation, aiming to track the optimized footholds. We demonstrate superior robustness in the presence of slippery or deformable ground when compared to model-based counterparts.
arXiv Detail & Related papers (2023-09-27T07:57:37Z)
Combining model-predictive control and predictive reinforcement learning for stable quadrupedal robot locomotion [0.0]
We study how this can be achieved by a combination of model-predictive and predictive reinforcement learning controllers. In this work, we combine both control methods to address the quadrupedal robot stable gait generation problem.
arXiv Detail & Related papers (2023-07-15T09:22:37Z)
Decision S4: Efficient Sequence-Based RL via State Spaces Layers [87.3063565438089]
We present an off-policy training procedure that works with trajectories, while still maintaining the training efficiency of the S4 model. We also present an on-policy training procedure that is trained in a recurrent manner, benefits from long-range dependencies, and is based on a novel stable actor-critic mechanism.
arXiv Detail & Related papers (2023-06-08T13:03:53Z)
Continuous Trajectory Generation Based on Two-Stage GAN [50.55181727145379]
We propose a novel two-stage generative adversarial framework to generate the continuous trajectory on the road network. Specifically, we build the generator under the human mobility hypothesis of the A* algorithm to learn the human mobility behavior. For the discriminator, we combine the sequential reward with the mobility yaw reward to enhance the effectiveness of the generator.
arXiv Detail & Related papers (2023-01-16T09:54:02Z)
Learning to Exploit Elastic Actuators for Quadruped Locomotion [7.9585932082270014]
Spring-based actuators in legged locomotion provide energy-efficiency and improved performance, but increase the difficulty of controller design. We propose to learn model-free controllers directly on the real robot. We evaluate the proposed approach on the DLR elastic quadruped bert.
arXiv Detail & Related papers (2022-09-15T09:43:17Z)
Training and Evaluation of Deep Policies using Reinforcement Learning and Generative Models [67.78935378952146]
GenRL is a framework for solving sequential decision-making problems. It exploits the combination of reinforcement learning and latent variable generative models. We experimentally determine the characteristics of generative models that have most influence on the performance of the final policy training.
arXiv Detail & Related papers (2022-04-18T22:02:32Z)
RLOC: Terrain-Aware Legged Locomotion using Reinforcement Learning and Optimal Control [6.669503016190925]
We present a unified model-based and data-driven approach for quadrupedal planning and control. We map sensory information and desired base velocity commands into footstep plans using a reinforcement learning policy. We train and evaluate our framework on a complex quadrupedal system, ANYmal B, and demonstrate transferability to a larger and heavier robot, ANYmal C, without requiring retraining.
arXiv Detail & Related papers (2020-12-05T18:30:23Z)
Continuous Transition: Improving Sample Efficiency for Continuous Control Problems via MixUp [119.69304125647785]
This paper introduces a concise yet powerful method to construct Continuous Transition. Specifically, we propose to synthesize new transitions for training by linearly interpolating the consecutive transitions. To keep the constructed transitions authentic, we also develop a discriminator to guide the construction process automatically.
arXiv Detail & Related papers (2020-11-30T01:20:23Z)
ReLMoGen: Leveraging Motion Generation in Reinforcement Learning for Mobile Manipulation [99.2543521972137]
ReLMoGen is a framework that combines a learned policy to predict subgoals and a motion generator to plan and execute the motion needed to reach these subgoals. Our method is benchmarked on a diverse set of seven robotics tasks in photo-realistic simulation environments. ReLMoGen shows outstanding transferability between different motion generators at test time, indicating a great potential to transfer to real robots.
arXiv Detail & Related papers (2020-08-18T08:05:15Z)

This list is automatically generated from the titles and abstracts of the papers on this site.