H-GAP: Humanoid Control with a Generalist Planner
- URL: http://arxiv.org/abs/2312.02682v1
- Date: Tue, 5 Dec 2023 11:40:24 GMT
- Title: H-GAP: Humanoid Control with a Generalist Planner
- Authors: Zhengyao Jiang, Yingchen Xu, Nolan Wagener, Yicheng Luo, Michael Janner, Edward Grefenstette, Tim Rocktäschel, Yuandong Tian
- Abstract summary: Humanoid Generalist Autoencoding Planner (H-GAP) is a generative model trained on humanoid trajectories derived from human motion-captured data.
For a 56-degrees-of-freedom humanoid, we empirically demonstrate that H-GAP learns to represent and generate a wide range of motor behaviours.
We also run a series of empirical studies on the scaling properties of H-GAP, showing the potential for performance gains from additional data but not from additional compute.
- Score: 45.50995825122686
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Humanoid control is an important research challenge offering avenues for
integration into human-centric infrastructures and enabling physics-driven
humanoid animations. The daunting challenges in this field stem from the
difficulty of optimizing in high-dimensional action spaces and the instability
introduced by the bipedal morphology of humanoids. However, the extensive
collection of human motion-captured data and the derived datasets of humanoid
trajectories, such as MoCapAct, paves the way to tackle these challenges. In
this context, we present Humanoid Generalist Autoencoding Planner (H-GAP), a
state-action trajectory generative model trained on humanoid trajectories
derived from human motion-captured data, capable of adeptly handling downstream
control tasks with Model Predictive Control (MPC). For a 56-degrees-of-freedom
humanoid, we empirically demonstrate that H-GAP learns to represent and
generate a wide range of motor behaviours. Further, without any learning from
online interactions, it can also flexibly transfer these behaviors to solve
novel downstream control tasks via planning. Notably, H-GAP excels established
MPC baselines that have access to the ground truth dynamics model, and is
superior or comparable to offline RL methods trained for individual tasks.
Finally, we do a series of empirical studies on the scaling properties of
H-GAP, showing the potential for performance gains via additional data but not
computing. Code and videos are available at
https://ycxuyingchen.github.io/hgap/.
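The planning loop the abstract describes (sample candidate trajectories from a model, score them against a task objective, execute the first action, replan) can be illustrated with a minimal sketch. This is generic random-shooting MPC on a toy 1-D system, not H-GAP's actual planner: `sample_trajectory` is a hypothetical stand-in for a learned state-action trajectory model, and the dynamics, reward, horizon, and sample count are all assumptions chosen for illustration.

```python
import random


def sample_trajectory(state, horizon, rng):
    """Hypothetical stand-in for a learned trajectory model.

    Returns (states, actions) for one sampled rollout. Here actions are
    uniform noise and the dynamics are a toy 1-D integrator (assumption).
    """
    states, actions = [], []
    s = state
    for _ in range(horizon):
        a = rng.uniform(-1.0, 1.0)
        s = s + 0.1 * a  # toy dynamics, not a humanoid simulator
        actions.append(a)
        states.append(s)
    return states, actions


def plan_first_action(state, reward_fn, horizon=8, num_samples=64, rng=None):
    """Sample candidate trajectories and return the first action of the best one."""
    if rng is None:
        rng = random.Random(0)
    best_return, best_action = float("-inf"), 0.0
    for _ in range(num_samples):
        states, actions = sample_trajectory(state, horizon, rng)
        ret = sum(reward_fn(s) for s in states)  # task objective over the rollout
        if ret > best_return:
            best_return, best_action = ret, actions[0]
    return best_action


def run_mpc(start, target, steps=50, seed=0):
    """Receding-horizon loop: plan, execute one action, replan."""
    rng = random.Random(seed)
    reward_fn = lambda s: -abs(s - target)  # illustrative task reward
    state = start
    for _ in range(steps):
        action = plan_first_action(state, reward_fn, rng=rng)
        state = state + 0.1 * action  # environment step (same toy dynamics)
    return state


if __name__ == "__main__":
    print(run_mpc(start=0.0, target=1.0))
```

Only the first action of the best candidate is executed before replanning, which is what lets a fixed generative model be reused for new tasks by swapping `reward_fn` without further training.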
Related papers
- SPIDER: Scalable Physics-Informed Dexterous Retargeting [45.24491726503442]
Learning dexterous and agile policy for humanoid and dexterous hand control requires large-scale demonstrations.
Human motion data is available from motion capture, videos, and virtual reality, which could help address the data scarcity problem.
We propose SPIDER, a physics-based framework to transform human demonstrations to dynamically augment robot trajectories.
arXiv Detail & Related papers (2025-11-12T16:54:00Z)
- SONIC: Supersizing Motion Tracking for Natural Humanoid Whole-Body Control [85.91101551600978]
We show that scaling up model capacity, data, and compute yields a generalist humanoid controller capable of creating natural and robust whole-body movements.
We build a foundation model for motion tracking by scaling along three axes: network size, dataset volume, and compute.
We show the practical utility of our model through two mechanisms: (1) a real-time universal kinematic planner that bridges motion tracking to downstream task execution, enabling natural and interactive control, and (2) a unified token space that supports various motion input interfaces.
arXiv Detail & Related papers (2025-11-11T04:37:40Z)
- ResMimic: From General Motion Tracking to Humanoid Whole-body Loco-Manipulation via Residual Learning [59.64325421657381]
Humanoid whole-body loco-manipulation promises transformative capabilities for daily service and warehouse tasks.
We introduce ResMimic, a two-stage residual learning framework for precise and expressive humanoid control from human motion data.
Results show substantial gains in task success, training efficiency, and robustness over strong baselines.
arXiv Detail & Related papers (2025-10-06T17:47:02Z)
- OmniRetarget: Interaction-Preserving Data Generation for Humanoid Whole-Body Loco-Manipulation and Scene Interaction [76.44108003274955]
A dominant paradigm for teaching humanoid robots complex skills is to retarget human motions as kinematic references to train reinforcement learning policies.
We introduce OmniRetarget, an interaction-preserving data generation engine based on an interaction mesh.
By minimizing the Laplacian deformation between the human and robot meshes, OmniRetarget generates kinematically feasible trajectories.
arXiv Detail & Related papers (2025-09-30T17:59:02Z)
- TrajBooster: Boosting Humanoid Whole-Body Manipulation via Trajectory-Centric Learning [79.59753528758361]
We present TrajBooster, a cross-embodiment framework that leverages abundant wheeled-humanoid data to boost bipedal VLA.
Our key idea is to use end-effector trajectories as a morphology-agnostic interface.
Results show that TrajBooster allows existing wheeled-humanoid data to efficiently strengthen bipedal humanoid VLA performance.
arXiv Detail & Related papers (2025-09-15T12:25:39Z)
- Graph RAG as Human Choice Model: Building a Data-Driven Mobility Agent with Preference Chain [4.675541221895496]
Recent advances in generative agents, powered by Large Language Models (LLMs), have shown promise in simulating human behaviors without relying on extensive datasets.
This paper introduces the Preference Chain, a novel method that integrates Graph Retrieval-Augmented Generation (RAG) with LLMs to enhance context-aware simulation of human behavior in transportation systems.
arXiv Detail & Related papers (2025-08-22T07:50:57Z)
- Being-H0: Vision-Language-Action Pretraining from Large-Scale Human Videos [66.62109400603394]
We introduce Being-H0, a dexterous Vision-Language-Action model trained on large-scale human videos.
Our approach centers on physical instruction tuning, a novel training paradigm that combines large-scale VLA pretraining from human videos, physical space alignment for 3D reasoning, and post-training adaptation for robotic tasks.
We empirically show the excellence of Being-H0 in hand motion generation and instruction following, and it also scales well with model and data sizes.
arXiv Detail & Related papers (2025-07-21T13:19:09Z)
- Zero-Shot Whole-Body Humanoid Control via Behavioral Foundation Models [71.34520793462069]
Unsupervised reinforcement learning (RL) aims at pre-training agents that can solve a wide range of downstream tasks in complex environments.
We introduce a novel algorithm regularizing unsupervised RL towards imitating trajectories from unlabeled behavior datasets.
We demonstrate the effectiveness of this new approach in a challenging humanoid control problem.
arXiv Detail & Related papers (2025-04-15T10:41:11Z)
- Exploring Disentangled and Controllable Human Image Synthesis: From End-to-End to Stage-by-Stage [34.72900198337818]
We introduce a new disentangled and controllable human synthesis task.
We first develop an end-to-end generative model trained on MVHumanNet for factor disentanglement.
We propose a stage-by-stage framework that decomposes human image generation into three sequential steps.
arXiv Detail & Related papers (2025-03-25T09:23:20Z)
- The Role of Domain Randomization in Training Diffusion Policies for Whole-Body Humanoid Control [14.36344580057985]
Diffusion Policies (DPs) have shown impressive results in robotic manipulation.
This paper investigates how dataset diversity and size affect the performance of DPs for humanoid whole-body control.
arXiv Detail & Related papers (2024-11-02T19:33:28Z)
- ImDy: Human Inverse Dynamics from Imitated Observations [47.994797555884325]
Inverse dynamics (ID) aims at reproducing the driven torques from human kinematic observations.
Conventional optimization-based ID requires expensive laboratory setups, restricting its availability.
We propose to exploit recent progress in human motion imitation algorithms to learn human inverse dynamics in a data-driven manner.
arXiv Detail & Related papers (2024-10-23T07:06:08Z)
- HINT: Learning Complete Human Neural Representations from Limited Viewpoints [69.76947323932107]
We propose a NeRF-based algorithm able to learn a detailed and complete human model from limited viewing angles.
As a result, our method can reconstruct complete humans even from a few viewing angles, increasing performance by more than 15% PSNR.
arXiv Detail & Related papers (2024-05-30T05:43:09Z)
- Humanoid Locomotion as Next Token Prediction [84.21335675130021]
Our model is a causal transformer trained via autoregressive prediction of sensorimotor trajectories.
We show that our model enables a full-sized humanoid to walk in San Francisco zero-shot.
Our model can transfer to the real world even when trained on only 27 hours of walking data, and can generalize to commands not seen during training, such as walking backward.
arXiv Detail & Related papers (2024-02-29T18:57:37Z)
- Learning Human Action Recognition Representations Without Real Humans [66.61527869763819]
We present a benchmark that leverages real-world videos with humans removed and synthetic data containing virtual humans to pre-train a model.
We then evaluate the transferability of the representation learned on this data to a diverse set of downstream action recognition benchmarks.
Our approach outperforms previous baselines by up to 5%.
arXiv Detail & Related papers (2023-11-10T18:38:14Z)
- Enhanced Human-Robot Collaboration using Constrained Probabilistic Human-Motion Prediction [5.501477817904299]
We propose a novel human motion prediction framework that incorporates human joint constraints and scene constraints.
It is tested on a human arm kinematic model and implemented on a human-robot collaborative setup with a UR5 robot arm.
arXiv Detail & Related papers (2023-10-05T05:12:14Z)
- Model Predictive Control for Fluid Human-to-Robot Handovers [50.72520769938633]
Planning motions that take human comfort into account is not a part of the human-robot handover process.
We propose to generate smooth motions via an efficient model-predictive control framework.
We conduct human-to-robot handover experiments on a diverse set of objects with several users.
arXiv Detail & Related papers (2022-03-31T23:08:20Z)
- Hierarchical Graph-Convolutional Variational AutoEncoding for Generative Modelling of Human Motion [1.2599533416395767]
Models of human motion commonly focus either on trajectory prediction or action classification but rarely both.
Here we propose a novel architecture based on hierarchical variational autoencoders and deep graph convolutional neural networks for generating a holistic model of action over multiple time-scales.
We show this Hierarchical Graph-convolutional Variational Autoencoder (HG-VAE) to be capable of generating coherent actions, detecting out-of-distribution data, and imputing missing data by gradient ascent on the model's posterior.
arXiv Detail & Related papers (2021-11-24T16:21:07Z)
- TRiPOD: Human Trajectory and Pose Dynamics Forecasting in the Wild [77.59069361196404]
TRiPOD is a novel method for predicting body dynamics based on graph attentional networks.
To incorporate a real-world challenge, we learn an indicator representing whether an estimated body joint is visible/invisible at each frame.
Our evaluation shows that TRiPOD outperforms all prior work and state-of-the-art specifically designed for each of the trajectory and pose forecasting tasks.
arXiv Detail & Related papers (2021-04-08T20:01:00Z)
- Residual Force Control for Agile Human Behavior Imitation and Extended Motion Synthesis [32.22704734791378]
Reinforcement learning has shown great promise for realistic human behaviors by learning humanoid control policies from motion capture data.
It is still very challenging to reproduce sophisticated human skills like ballet dance, or to stably imitate long-term human behaviors with complex transitions.
We propose a novel approach, residual force control (RFC), that augments a humanoid control policy by adding external residual forces into the action space.
arXiv Detail & Related papers (2020-06-12T17:56:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.