Guided Curriculum Learning for Walking Over Complex Terrain
- URL: http://arxiv.org/abs/2010.03848v2
- Date: Mon, 1 Feb 2021 19:36:06 GMT
- Title: Guided Curriculum Learning for Walking Over Complex Terrain
- Authors: Brendan Tidd, Nicolas Hudson, Akansel Cosgun
- Abstract summary: We propose a 3-stage curriculum to train Deep Reinforcement Learning policies for bipedal walking over various challenging terrains.
In simulation experiments, we show that our approach is effective in learning a separate walking policy for each of five terrain types.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reliable bipedal walking over complex terrain is a challenging problem, and using a curriculum can help learning. Curriculum learning is the idea of starting with an achievable version of a task and increasing the difficulty as a success criterion is met. We propose a 3-stage curriculum to train Deep Reinforcement Learning policies for bipedal walking over various challenging terrains. In the first stage, the agent starts on an easy terrain and the terrain difficulty is gradually increased, while forces derived from a target policy are applied to the robot joints and the base. In the second stage, the guiding forces are gradually reduced to zero. Finally, in the third stage, random perturbations with increasing magnitude are applied to the robot base to improve the robustness of the policies. In simulation experiments, we show that our approach is effective in learning a separate walking policy for each of five terrain types: flat, hurdles, gaps, stairs, and steps. Moreover, we demonstrate that in the absence of human demonstrations, a simple hand-designed walking trajectory is a sufficient prior for learning to traverse complex terrain types. In ablation studies, we show that removing any one of the three stages of the curriculum degrades learning performance.
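The abstract describes the three stages concretely enough to sketch the scheduling logic. The following is a minimal illustrative sketch in Python, not the authors' implementation: the class name, the rolling success-rate criterion, the step sizes, and the PD form used to derive guiding forces from the target policy are all assumptions.

```python
import numpy as np

class GuidedCurriculum:
    """Hypothetical scheduler for the paper's 3-stage curriculum.

    Stage 1: increase terrain difficulty while guiding forces stay on.
    Stage 2: fade the guiding forces to zero.
    Stage 3: grow random perturbations applied to the robot base.
    """

    def __init__(self, success_threshold=0.8, window=100):
        self.success_threshold = success_threshold  # assumed success criterion
        self.window = window                        # assumed rolling window size
        self.results = []                           # recent episode outcomes
        self.stage = 1
        self.terrain_difficulty = 0.0   # 0 = easiest terrain, 1 = hardest
        self.guide_gain = 1.0           # scale on forces from the target policy
        self.perturb_magnitude = 0.0    # scale of random base perturbations

    def record_episode(self, success: bool):
        """Log an episode outcome and advance the curriculum when the
        rolling success rate clears the threshold."""
        self.results.append(float(success))
        if len(self.results) > self.window:
            self.results.pop(0)
        if len(self.results) == self.window and \
                np.mean(self.results) >= self.success_threshold:
            self._advance()
            self.results.clear()

    def _advance(self):
        if self.stage == 1:
            # Stage 1: harder terrain, guiding forces still applied.
            self.terrain_difficulty = min(1.0, self.terrain_difficulty + 0.1)
            if self.terrain_difficulty >= 1.0:
                self.stage = 2
        elif self.stage == 2:
            # Stage 2: gradually reduce the guiding forces to zero.
            self.guide_gain = max(0.0, self.guide_gain - 0.1)
            if self.guide_gain <= 0.0:
                self.stage = 3
        else:
            # Stage 3: random base perturbations of increasing magnitude.
            self.perturb_magnitude = min(1.0, self.perturb_magnitude + 0.1)

    def guiding_force(self, target_joint_pos, joint_pos, joint_vel,
                      kp=50.0, kd=1.0):
        # One plausible way to "derive forces from a target policy":
        # a PD term pulling joints toward the target policy's positions,
        # scaled by the current curriculum gain.
        return self.guide_gain * (kp * (np.asarray(target_joint_pos)
                                        - np.asarray(joint_pos))
                                  - kd * np.asarray(joint_vel))
```

A training loop would call `record_episode(...)` after each rollout and read `terrain_difficulty`, `guide_gain`, and `perturb_magnitude` when configuring the next episode.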
Related papers
- Learning Humanoid Locomotion over Challenging Terrain (2024-10-04). We present a learning-based approach for blind humanoid locomotion capable of traversing challenging natural and man-made terrains. Our model is first pre-trained on a dataset of flat-ground trajectories with sequence modeling, and then fine-tuned on uneven terrain using reinforcement learning. We evaluate our model on a real humanoid robot across a variety of terrains, including rough, deformable, and sloped surfaces.
- Single-Shot Learning of Stable Dynamical Systems for Long-Horizon Manipulation Tasks (2024-10-01). This paper focuses on improving task success rates while reducing the amount of training data needed. Our approach introduces a novel method that segments long-horizon demonstrations into discrete steps defined by waypoints and subgoals. We validate our approach through both simulation and real-world experiments, demonstrating effective transfer from simulation to physical robotic platforms.
- Learning Bipedal Walking for Humanoid Robots in Challenging Environments with Obstacle Avoidance (2024-09-25). Deep reinforcement learning has seen successful implementations on humanoid robots to achieve dynamic walking. In this paper, we aim to achieve bipedal locomotion in an environment where obstacles are present using policy-based reinforcement learning.
- Imitation Is Not Enough: Robustifying Imitation with Reinforcement Learning for Challenging Driving Scenarios (2022-12-21). We show how imitation learning combined with reinforcement learning can substantially improve the safety and reliability of driving policies. We train a policy on over 100k miles of urban driving data, and measure its effectiveness in test scenarios grouped by different levels of collision likelihood.
- You Only Live Once: Single-Life Reinforcement Learning (2022-10-17). In many real-world situations, the goal might not be to learn a policy that can do the task repeatedly, but simply to perform a new task successfully once in a single trial. We formalize this problem setting, where an agent must complete a task within a single episode without interventions. We propose an algorithm, $Q$-weighted adversarial learning (QWALE), which employs a distribution matching strategy.
- Vision-Based Mobile Robotics Obstacle Avoidance With Deep Reinforcement Learning (2021-03-08). Obstacle avoidance is a fundamental and challenging problem for autonomous navigation of mobile robots. In this paper, we consider the problem of obstacle avoidance in simple 3D environments where the robot must rely solely on a single monocular camera. We tackle obstacle avoidance with a data-driven end-to-end deep learning approach.
- ALLSTEPS: Curriculum-driven Learning of Stepping Stone Skills (2020-05-09). Finding good solutions to stepping-stone locomotion is a longstanding and fundamental challenge for animation and robotics. We present fully learned solutions to this difficult problem using reinforcement learning. Results are presented for a simulated human character, a realistic bipedal robot simulation, and a monster character, in each case producing robust, plausible motions.
- Obstacle Tower Without Human Demonstrations: How Far a Deep Feed-Forward Network Goes with Reinforcement Learning (2020-04-01). The Obstacle Tower Challenge is the task of mastering a procedurally generated chain of levels that progressively get harder to complete. We present an approach that performed competitively (placed 7th) but starts completely from scratch by means of Deep Reinforcement Learning.
- Learning to Generalize Across Long-Horizon Tasks from Human Demonstrations (2020-03-13). Generalization Through Imitation (GTI) is a two-stage offline imitation learning algorithm. GTI exploits a structure where demonstrated trajectories for different tasks intersect at common regions of the state space. In the first stage of GTI, we train a policy that leverages these intersections to compose behaviors from different demonstration trajectories. In the second stage of GTI, we train a goal-directed agent to generalize to novel start and goal configurations.