Towards Embodiment Scaling Laws in Robot Locomotion
- URL: http://arxiv.org/abs/2505.05753v1
- Date: Fri, 09 May 2025 03:25:43 GMT
- Title: Towards Embodiment Scaling Laws in Robot Locomotion
- Authors: Bo Ai, Liu Dai, Nico Bohlinger, Dichen Li, Tongzhou Mu, Zhanxin Wu, K. Fay, Henrik I. Christensen, Jan Peters, Hao Su
- Abstract summary: We investigate embodiment scaling laws across multiple embodiments. We find that increasing the number of training embodiments improves generalization to unseen ones. The results represent a step toward general embodied intelligence, with potential relevance to adaptive control.
- Score: 36.86431442666063
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Developing generalist agents that can operate across diverse tasks, environments, and physical embodiments is a grand challenge in robotics and artificial intelligence. In this work, we focus on the axis of embodiment and investigate embodiment scaling laws, the hypothesis that increasing the number of training embodiments improves generalization to unseen ones. Using robot locomotion as a test bed, we procedurally generate a dataset of $\sim$1,000 varied embodiments, spanning humanoids, quadrupeds, and hexapods, and train generalist policies capable of handling diverse observation and action spaces on random subsets. We find that increasing the number of training embodiments improves generalization to unseen ones, and scaling embodiments is more effective in enabling embodiment-level generalization than scaling data on small, fixed sets of embodiments. Notably, our best policy, trained on the full dataset, zero-shot transfers to novel embodiments in the real world, such as Unitree Go2 and H1. These results represent a step toward general embodied intelligence, with potential relevance to adaptive control for configurable robots, co-design of morphology and control, and beyond.
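The scaling-law hypothesis above is commonly made quantitative by fitting a power law between the scaled quantity (here, the number of training embodiments) and a generalization metric. As a hedged illustration of that analysis step, the sketch below fits error ≈ a·n^b in log-log space; the data points are invented for illustration and are not results from the paper.

```python
import numpy as np

# Hypothetical data: number of training embodiments vs. generalization
# error on unseen embodiments. Values are illustrative only.
n_embodiments = np.array([10, 50, 100, 300, 1000], dtype=float)
error = np.array([0.60, 0.42, 0.35, 0.26, 0.18])

# Fit a power law error ≈ a * n^b via linear regression in log-log space:
# log(error) = log(a) + b * log(n), so the slope is b, the intercept log(a).
log_n, log_e = np.log(n_embodiments), np.log(error)
b, log_a = np.polyfit(log_n, log_e, 1)
a = np.exp(log_a)

print(f"fitted scaling law: error ~ {a:.2f} * n^({b:.2f})")
```

A negative fitted exponent `b` would support the hypothesis that adding training embodiments steadily reduces error on unseen embodiments.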
Related papers
- Robometer: Scaling General-Purpose Robotic Reward Models via Trajectory Comparisons [69.87766750714945]
General-purpose robot reward models are typically trained to predict absolute task progress from expert demonstrations. We introduce Robometer, a scalable reward modeling framework that combines intra-trajectory progress supervision with inter-trajectory preference supervision. Robometer is trained with a dual objective: a frame-level progress loss that anchors reward magnitude on expert data, and a trajectory-comparison preference loss that imposes global ordering constraints.
arXiv Detail & Related papers (2026-03-02T17:38:58Z) - Cross-Embodiment Offline Reinforcement Learning for Heterogeneous Robot Datasets [47.55508376631633]
This work combines offline reinforcement learning (offline RL) with cross-embodiment learning. We construct a suite of locomotion datasets spanning 16 distinct robot platforms. Experiments confirm that this combined approach excels at pre-training with datasets rich in suboptimal trajectories, outperforming pure behavior cloning. We introduce an embodiment-based grouping strategy in which robots are clustered by morphological similarity and the model is updated with a group gradient.
arXiv Detail & Related papers (2026-02-20T06:39:17Z) - Multi-Embodiment Locomotion at Scale with extreme Embodiment Randomization [16.640420524594443]
We present a single, general locomotion policy trained on a diverse collection of 50 legged robots. By combining an improved embodiment-aware architecture (URMAv2) with a performance-based curriculum for extreme Embodiment Randomization, our policy learns to control millions of morphological variations.
arXiv Detail & Related papers (2025-09-02T20:32:02Z) - Is Diversity All You Need for Scalable Robotic Manipulation? [50.747150672933316]
We investigate the nuanced role of data diversity in robot learning by examining three critical dimensions: task (what to do), embodiment (which robot to use), and expert (who demonstrates), challenging the conventional intuition that "more diverse is better." We show that task diversity proves more critical than per-task demonstration quantity, benefiting transfer from diverse pre-training tasks to novel downstream scenarios. We propose a distribution-debiasing method to mitigate velocity ambiguity; the resulting GO-1-Pro achieves substantial performance gains of 15%, equivalent to using 2.5 times the pre-training data.
arXiv Detail & Related papers (2025-07-08T17:52:44Z) - AnyBody: A Benchmark Suite for Cross-Embodiment Manipulation [59.671764778486995]
Generalizing control policies to novel embodiments remains a fundamental challenge in enabling scalable and transferable learning in robotics. We introduce a benchmark for learning cross-embodiment manipulation, focusing on two foundational tasks, reach and push, across a diverse range of morphologies. We evaluate the ability of different RL policies to learn from multiple morphologies and to generalize to novel ones.
arXiv Detail & Related papers (2025-05-21T00:21:38Z) - AgiBot World Colosseo: A Large-scale Manipulation Platform for Scalable and Intelligent Embodied Systems [88.05152114775498]
AgiBot World is a large-scale platform comprising over 1 million trajectories across 217 tasks in five deployment scenarios. AgiBot World guarantees a high-quality and diverse data distribution. GO-1 exhibits exceptional capability in real-world dexterous and long-horizon tasks.
arXiv Detail & Related papers (2025-03-09T15:40:29Z) - DexterityGen: Foundation Controller for Unprecedented Dexterity [67.15251368211361]
Teaching robots dexterous manipulation skills, such as tool use, presents a significant challenge. Current approaches can be broadly categorized into two strategies: human teleoperation (for imitation learning) and sim-to-real reinforcement learning. We introduce DexterityGen, which uses RL to pretrain large-scale dexterous motion primitives, such as in-hand rotation or translation. In the real world, we use human teleoperation as a prompt to the controller to produce highly dexterous behavior.
arXiv Detail & Related papers (2025-02-06T18:49:35Z) - Universal Actions for Enhanced Embodied Foundation Models [25.755178700280933]
We introduce UniAct, a new embodied foundation modeling framework operating in a Universal Action Space. Our learned universal actions capture the generic atomic behaviors across diverse robots by exploiting their shared structural features. Our 0.5B instantiation of UniAct outperforms 14X larger SOTA embodied foundation models in extensive evaluations on various real-world and simulation robots.
arXiv Detail & Related papers (2025-01-17T10:45:22Z) - Human-Humanoid Robots Cross-Embodiment Behavior-Skill Transfer Using Decomposed Adversarial Learning from Demonstration [9.42179962375058]
We propose a transferable framework that reduces the data bottleneck by using a unified digital human model as a common prototype. The model learns behavior primitives from human demonstrations through adversarial imitation, and complex robot structures are decomposed into functional components. Our framework is validated on five humanoid robots with diverse configurations.
arXiv Detail & Related papers (2024-12-19T18:41:45Z) - The One RING: a Robotic Indoor Navigation Generalist [58.30694487843546]
RING (Robotic Indoor Navigation Generalist) is an embodiment-agnostic policy that turns any mobile robot into an effective indoor semantic navigator. Trained entirely in simulation, RING leverages large-scale randomization over robot embodiments to enable robust generalization to many real-world platforms.
arXiv Detail & Related papers (2024-12-18T23:15:41Z) - GRAM: Generalization in Deep RL with a Robust Adaptation Module [62.662894174616895]
In this work, we present a framework for dynamics generalization in deep reinforcement learning. We introduce a robust adaptation module that provides a mechanism for identifying and reacting to both in-distribution and out-of-distribution environment dynamics. Our algorithm GRAM achieves strong generalization performance across in-distribution and out-of-distribution scenarios upon deployment.
arXiv Detail & Related papers (2024-12-05T16:39:01Z) - Scaling Proprioceptive-Visual Learning with Heterogeneous Pre-trained Transformers [41.069074375686164]
We propose Heterogeneous Pre-trained Transformers (HPT), which pre-train the trunk of a policy neural network to learn a representation shared across tasks and embodiments.
We conduct experiments to investigate the scaling behaviors of training objectives across as many as 52 datasets.
HPTs outperform several baselines and enhance the fine-tuned policy performance by over 20% on unseen tasks.
arXiv Detail & Related papers (2024-09-30T17:39:41Z) - Scaling Cross-Embodied Learning: One Policy for Manipulation, Navigation, Locomotion and Aviation [49.03165169369552]
By training a single policy across many different kinds of robots, a robot learning method can leverage much broader and more diverse datasets.
We propose CrossFormer, a scalable and flexible transformer-based policy that can consume data from any embodiment.
We demonstrate that the same network weights can control vastly different robots, including single and dual arm manipulation systems, wheeled robots, quadcopters, and quadrupeds.
arXiv Detail & Related papers (2024-08-21T17:57:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.