Related papers: HumanPlus: Humanoid Shadowing and Imitation from Humans

HumanPlus: Humanoid Shadowing and Imitation from Humans

URL: http://arxiv.org/abs/2406.10454v1
Date: Sat, 15 Jun 2024 00:41:34 GMT
Title: HumanPlus: Humanoid Shadowing and Imitation from Humans
Authors: Zipeng Fu, Qingqing Zhao, Qi Wu, Gordon Wetzstein, Chelsea Finn,
Abstract summary: We introduce a full-stack system for humanoids to learn motion and autonomous skills from human data. We first train a low-level policy in simulation via reinforcement learning using existing 40-hour human motion datasets. We then perform supervised behavior cloning to train skill policies using egocentric vision, allowing humanoids to complete different tasks autonomously.
Score: 82.47551890765202
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: One of the key arguments for building robots that have similar form factors to human beings is that we can leverage the massive human data for training. Yet, doing so has remained challenging in practice due to the complexities in humanoid perception and control, lingering physical gaps between humanoids and humans in morphologies and actuation, and lack of a data pipeline for humanoids to learn autonomous skills from egocentric vision. In this paper, we introduce a full-stack system for humanoids to learn motion and autonomous skills from human data. We first train a low-level policy in simulation via reinforcement learning using existing 40-hour human motion datasets. This policy transfers to the real world and allows humanoid robots to follow human body and hand motion in real time using only a RGB camera, i.e. shadowing. Through shadowing, human operators can teleoperate humanoids to collect whole-body data for learning different tasks in the real world. Using the data collected, we then perform supervised behavior cloning to train skill policies using egocentric vision, allowing humanoids to complete different tasks autonomously by imitating human skills. We demonstrate the system on our customized 33-DoF 180cm humanoid, autonomously completing tasks such as wearing a shoe to stand up and walk, unloading objects from warehouse racks, folding a sweatshirt, rearranging objects, typing, and greeting another robot with 60-100% success rates using up to 40 demonstrations. Project website: https://humanoid-ai.github.io/

Related papers

Humanoid Policy ~ Human Policy [26.01581047414598]
We train a human-humanoid behavior policy, which we term Human Action Transformer (HAT) The state-action space of HAT is unified for both humans and humanoid robots and can be differentiably retargeted to robot actions. We show that human data improves both generalization and robustness of HAT with significantly better data collection efficiency.
arXiv Detail & Related papers (2025-03-17T17:59:09Z)
Learning from Massive Human Videos for Universal Humanoid Pose Control [46.417054298537195]
This paper introduces Humanoid-X, a large-scale dataset of over 20 million humanoid robot poses with corresponding text-based motion descriptions. We train a large humanoid model, UH-1, which takes text instructions as input and outputs corresponding actions to control a humanoid robot. Our scalable training approach leads to superior generalization in text-based humanoid control, marking a significant step toward adaptable, real-world-ready humanoid robots.
arXiv Detail & Related papers (2024-12-18T18:59:56Z)
Generalizable Humanoid Manipulation with 3D Diffusion Policies [41.23383596258797]
We build a real-world robotic system to address the problem of autonomous manipulation by humanoid robots. Our system is mainly an integration of 1) a whole-upper-body robotic teleoperation system to acquire human-like robot data, and 2) a 25-DoF humanoid robot platform with a height-adjustable cart and a 3D LiDAR sensor. We show that using only data collected in one scene and with only onboard computing, a full-sized humanoid robot can autonomously perform skills in diverse real-world scenarios.
arXiv Detail & Related papers (2024-10-14T17:59:00Z)
HumanoidBench: Simulated Humanoid Benchmark for Whole-Body Locomotion and Manipulation [50.616995671367704]
We present a high-dimensional, simulated robot learning benchmark, HumanoidBench, featuring a humanoid robot equipped with dexterous hands. Our findings reveal that state-of-the-art reinforcement learning algorithms struggle with most tasks, whereas a hierarchical learning approach achieves superior performance when supported by robust low-level policies.
arXiv Detail & Related papers (2024-03-15T17:45:44Z)
Learning Human-to-Humanoid Real-Time Whole-Body Teleoperation [34.65637397405485]
We present Human to Humanoid (H2O), a framework that enables real-time whole-body teleoperation of a humanoid robot with only an RGB camera. We train a robust real-time humanoid motion imitator in simulation using these refined motions and transfer it to the real humanoid robot in a zero-shot manner. To the best of our knowledge, this is the first demonstration to achieve learning-based real-time whole-body humanoid teleoperation.
arXiv Detail & Related papers (2024-03-07T12:10:41Z)
Expressive Whole-Body Control for Humanoid Robots [20.132927075816742]
We learn a whole-body control policy on a human-sized robot to mimic human motions as realistic as possible. With training in simulation and Sim2Real transfer, our policy can control a humanoid robot to walk in different styles, shake hands with humans, and even dance with a human in the real world.
arXiv Detail & Related papers (2024-02-26T18:09:24Z)
Structured World Models from Human Videos [45.08503470821952]
We tackle the problem of learning complex, general behaviors directly in the real world. We propose an approach for robots to efficiently learn manipulation skills using only a handful of real-world interaction trajectories.
arXiv Detail & Related papers (2023-08-21T17:59:32Z)
Giving Robots a Hand: Learning Generalizable Manipulation with Eye-in-Hand Human Video Demonstrations [66.47064743686953]
Eye-in-hand cameras have shown promise in enabling greater sample efficiency and generalization in vision-based robotic manipulation. Videos of humans performing tasks, on the other hand, are much cheaper to collect since they eliminate the need for expertise in robotic teleoperation. In this work, we augment narrow robotic imitation datasets with broad unlabeled human video demonstrations to greatly enhance the generalization of eye-in-hand visuomotor policies.
arXiv Detail & Related papers (2023-07-12T07:04:53Z)
SACSoN: Scalable Autonomous Control for Social Navigation [62.59274275261392]
We develop methods for training policies for socially unobtrusive navigation. By minimizing this counterfactual perturbation, we can induce robots to behave in ways that do not alter the natural behavior of humans in the shared space. We collect a large dataset where an indoor mobile robot interacts with human bystanders.
arXiv Detail & Related papers (2023-06-02T19:07:52Z)
HERD: Continuous Human-to-Robot Evolution for Learning from Human Demonstration [57.045140028275036]
We show that manipulation skills can be transferred from a human to a robot through the use of micro-evolutionary reinforcement learning. We propose an algorithm for multi-dimensional evolution path searching that allows joint optimization of both the robot evolution path and the policy.
arXiv Detail & Related papers (2022-12-08T15:56:13Z)
Human Grasp Classification for Reactive Human-to-Robot Handovers [50.91803283297065]
We propose an approach for human-to-robot handovers in which the robot meets the human halfway. We collect a human grasp dataset which covers typical ways of holding objects with various hand shapes and poses. We present a planning and execution approach that takes the object from the human hand according to the detected grasp and hand position.
arXiv Detail & Related papers (2020-03-12T19:58:03Z)

This list is automatically generated from the titles and abstracts of the papers in this site.