Hierarchical World Models as Visual Whole-Body Humanoid Controllers
- URL: http://arxiv.org/abs/2405.18418v2
- Date: Fri, 31 May 2024 17:03:00 GMT
- Title: Hierarchical World Models as Visual Whole-Body Humanoid Controllers
- Authors: Nicklas Hansen, Jyothir S V, Vlad Sobal, Yann LeCun, Xiaolong Wang, Hao Su
- Abstract summary: Whole-body control for humanoids is challenging due to the high-dimensional nature of the problem and the inherent instability of a bipedal morphology.
We propose a hierarchical world model in which a high-level agent generates commands based on visual observations for a low-level agent to execute, both of which are trained with rewards.
Our approach produces highly performant control policies in 8 tasks with a simulated 56-DoF humanoid, while synthesizing motions that are broadly preferred by humans.
- Score: 44.67878600054106
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Whole-body control for humanoids is challenging due to the high-dimensional nature of the problem, coupled with the inherent instability of a bipedal morphology. Learning from visual observations further exacerbates this difficulty. In this work, we explore highly data-driven approaches to visual whole-body humanoid control based on reinforcement learning, without any simplifying assumptions, reward design, or skill primitives. Specifically, we propose a hierarchical world model in which a high-level agent generates commands based on visual observations for a low-level agent to execute, both of which are trained with rewards. Our approach produces highly performant control policies in 8 tasks with a simulated 56-DoF humanoid, while synthesizing motions that are broadly preferred by humans. Code and videos: https://nicklashansen.com/rlpuppeteer
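A minimal sketch of the two-level interface the abstract describes: a high-level agent maps visual observations to abstract commands, and a low-level agent maps proprioception plus the command to joint actions for a 56-DoF humanoid. The dimensions and the random linear policies here are illustrative stand-ins, not the paper's learned world-model agents.

```python
import numpy as np

rng = np.random.default_rng(0)

class HighLevelAgent:
    """Maps a visual observation to an abstract command.
    (Illustrative linear policy; the paper trains this with RL.)"""
    def __init__(self, obs_dim, cmd_dim):
        self.W = rng.standard_normal((cmd_dim, obs_dim)) * 0.01

    def command(self, visual_obs):
        return np.tanh(self.W @ visual_obs)

class LowLevelAgent:
    """Tracks the high-level command with joint-space actions
    for a humanoid with n_joints degrees of freedom."""
    def __init__(self, proprio_dim, cmd_dim, n_joints=56):
        self.W = rng.standard_normal((n_joints, proprio_dim + cmd_dim)) * 0.01

    def act(self, proprio, cmd):
        return np.tanh(self.W @ np.concatenate([proprio, cmd]))

# One step of the hierarchical control loop (hypothetical dimensions).
high = HighLevelAgent(obs_dim=64, cmd_dim=8)
low = LowLevelAgent(proprio_dim=56, cmd_dim=8)
visual_obs = rng.standard_normal(64)   # e.g. encoded camera frame
proprio = rng.standard_normal(56)      # joint positions/velocities
cmd = high.command(visual_obs)         # high level: vision -> command
action = low.act(proprio, cmd)         # low level: command -> 56-DoF action
print(action.shape)
```

Both levels are trained with rewards in the paper; the key design choice illustrated here is that only the high-level agent consumes visual input, while the low-level agent sees proprioception and the command.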
Related papers
- Aligning Machine and Human Visual Representations across Abstraction Levels [42.86478924838503]
Deep neural networks have achieved success across a wide range of applications, including as models of human behavior in vision tasks.
However, neural network training and human learning differ in fundamental ways, and neural networks often fail to generalize as robustly as humans do.
We highlight a key misalignment between vision models and humans: whereas human conceptual knowledge is hierarchically organized from fine- to coarse-scale distinctions, model representations do not accurately capture all these levels of abstraction.
To address this misalignment, we first train a teacher model to imitate human judgments, then transfer human-like structure from its representations into pretrained state-of-the-art models.
arXiv Detail & Related papers (2024-09-10T13:41:08Z) - Advancing Humanoid Locomotion: Mastering Challenging Terrains with Denoising World Model Learning [11.648198428063415]
We introduce Denoising World Model Learning (DWL), an end-to-end reinforcement learning framework for humanoid locomotion control.
With DWL, we demonstrate the first humanoid robot to master challenging real-world terrains, including snowy and inclined ground in the wild, stairs, and extremely uneven surfaces.
arXiv Detail & Related papers (2024-08-26T17:59:03Z) - Dreamitate: Real-World Visuomotor Policy Learning via Video Generation [49.03287909942888]
We propose a visuomotor policy learning framework that fine-tunes a video diffusion model on human demonstrations of a given task.
We generate an example of an execution of the task conditioned on images of a novel scene, and use this synthesized execution directly to control the robot.
arXiv Detail & Related papers (2024-06-24T17:59:45Z) - HumanPlus: Humanoid Shadowing and Imitation from Humans [82.47551890765202]
We introduce a full-stack system for humanoids to learn motion and autonomous skills from human data.
We first train a low-level policy in simulation via reinforcement learning using existing 40-hour human motion datasets.
We then perform supervised behavior cloning to train skill policies using egocentric vision, allowing humanoids to complete different tasks autonomously.
arXiv Detail & Related papers (2024-06-15T00:41:34Z) - HINT: Learning Complete Human Neural Representations from Limited Viewpoints [69.76947323932107]
We propose a NeRF-based algorithm able to learn a detailed and complete human model from limited viewing angles.
As a result, our method can reconstruct complete humans even from a few viewing angles, improving PSNR by more than 15%.
arXiv Detail & Related papers (2024-05-30T05:43:09Z) - H-GAP: Humanoid Control with a Generalist Planner [45.50995825122686]
Humanoid Generalist Autoencoding Planner (H-GAP) is a generative model trained on humanoid trajectories derived from human motion-capture data.
For a 56-DoF humanoid, we empirically demonstrate that H-GAP learns to represent and generate a wide range of motor behaviours.
We also conduct a series of empirical studies on the scaling properties of H-GAP, showing the potential for performance gains from additional data but not from additional compute.
arXiv Detail & Related papers (2023-12-05T11:40:24Z) - Generative Adversarial Neuroevolution for Control Behaviour Imitation [3.04585143845864]
We propose to explore whether deep neuroevolution can be used for behaviour imitation on popular simulation environments.
We introduce a simple co-evolutionary adversarial generation framework, and evaluate its capabilities by evolving standard deep recurrent networks.
Across all tasks, we find the final elite actor agents capable of achieving scores as high as those obtained by the pre-trained agents.
arXiv Detail & Related papers (2023-04-03T16:33:22Z) - Real-World Humanoid Locomotion with Reinforcement Learning [92.85934954371099]
We present a fully learning-based approach for real-world humanoid locomotion.
Our controller can walk over various outdoor terrains, is robust to external disturbances, and can adapt in context.
arXiv Detail & Related papers (2023-03-06T18:59:09Z) - On the Emergence of Whole-body Strategies from Humanoid Robot Push-recovery Learning [32.070068456106895]
We apply model-free Deep Reinforcement Learning for training a general and robust humanoid push-recovery policy in a simulation environment.
Our method targets high-dimensional whole-body humanoid control and is validated on the iCub humanoid.
arXiv Detail & Related papers (2021-04-29T17:49:20Z) - Visual Navigation Among Humans with Optimal Control as a Supervisor [72.5188978268463]
We propose an approach that combines learning-based perception with model-based optimal control to navigate among humans.
Our approach is enabled by our novel data-generation tool, HumANav.
We demonstrate that the learned navigation policies can anticipate and react to humans without explicitly predicting future human motion.
arXiv Detail & Related papers (2020-03-20T16:13:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.