Robometer: Scaling General-Purpose Robotic Reward Models via Trajectory Comparisons
- URL: http://arxiv.org/abs/2603.02115v1
- Date: Mon, 02 Mar 2026 17:38:58 GMT
- Title: Robometer: Scaling General-Purpose Robotic Reward Models via Trajectory Comparisons
- Authors: Anthony Liang, Yigit Korkmaz, Jiahui Zhang, Minyoung Hwang, Abrar Anwar, Sidhant Kaushik, Aditya Shah, Alex S. Huang, Luke Zettlemoyer, Dieter Fox, Yu Xiang, Anqi Li, Andreea Bobu, Abhishek Gupta, Stephen Tu, Erdem Biyik, Jesse Zhang,
- Abstract summary: General-purpose robot reward models are typically trained to predict absolute task progress from expert demonstrations. We introduce Robometer, a scalable reward modeling framework that combines intra-trajectory progress supervision with inter-trajectory preference supervision. Robometer is trained with a dual objective: a frame-level progress loss that anchors reward magnitude on expert data, and a trajectory-comparison preference loss that imposes global ordering constraints.
- Score: 69.87766750714945
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: General-purpose robot reward models are typically trained to predict absolute task progress from expert demonstrations, providing only local, frame-level supervision. While effective for expert demonstrations, this paradigm scales poorly to large-scale robotics datasets where failed and suboptimal trajectories are abundant and assigning dense progress labels is ambiguous. We introduce Robometer, a scalable reward modeling framework that combines intra-trajectory progress supervision with inter-trajectory preference supervision. Robometer is trained with a dual objective: a frame-level progress loss that anchors reward magnitude on expert data, and a trajectory-comparison preference loss that imposes global ordering constraints across trajectories of the same task, enabling effective learning from both real and augmented failed trajectories. To support this formulation at scale, we curate RBM-1M, a reward-learning dataset comprising over one million trajectories spanning diverse robot embodiments and tasks, including substantial suboptimal and failure data. Across benchmarks and real-world evaluations, Robometer learns more generalizable reward functions than prior methods and improves robot learning performance across a diverse set of downstream applications. Code, model weights, and videos at https://robometer.github.io/.
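The abstract's dual objective can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the use of mean squared error for the progress term, the Bradley-Terry form of the preference term, and the weighting parameter `lam` are all assumptions made for illustration.

```python
import math

def progress_loss(pred_rewards, progress_labels):
    """Assumed frame-level progress term: mean squared error between
    per-frame reward predictions and dense progress labels (e.g. a
    normalized timestep) on expert demonstrations."""
    n = len(pred_rewards)
    return sum((p - y) ** 2 for p, y in zip(pred_rewards, progress_labels)) / n

def preference_loss(score_preferred, score_dispreferred):
    """Assumed trajectory-comparison term: Bradley-Terry negative
    log-likelihood that the preferred trajectory's aggregate score
    exceeds the dispreferred one's, i.e. -log sigmoid(s+ - s-)."""
    margin = score_preferred - score_dispreferred
    return math.log1p(math.exp(-margin))

def dual_objective(pred, labels, s_good, s_bad, lam=1.0):
    """Combined loss: progress anchoring on expert frames plus a
    global ordering constraint across trajectories of the same task
    (lam is a hypothetical trade-off weight)."""
    return progress_loss(pred, labels) + lam * preference_loss(s_good, s_bad)
```

The preference term only constrains the *ordering* of trajectory scores, which is why it can absorb failed and suboptimal trajectories where dense progress labels would be ambiguous; the progress term keeps the reward magnitude anchored on expert data.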
Related papers
- RoboGene: Boosting VLA Pre-training via Diversity-Driven Agentic Framework for Real-World Task Generation [37.52152452548065]
RoboGene is an agentic framework designed to automate the generation of diverse, physically plausible manipulation tasks. We conduct extensive quantitative analysis and large-scale real-world experiments, collecting datasets of 18k trajectories. Results demonstrate that RoboGene significantly outperforms state-of-the-art foundation models.
arXiv Detail & Related papers (2026-02-18T13:29:43Z)
- RoboTracer: Mastering Spatial Trace with Reasoning in Vision-Language Models for Robotics [53.053660003572965]
We propose RoboTracer, a 3D-aware VLM that is the first to achieve both 3D spatial referring and measuring. RoboTracer advances multi-step metric-grounded reasoning via reinforcement fine-tuning. We present TraceSpatial-Bench, a challenging benchmark to evaluate spatial tracing.
arXiv Detail & Related papers (2025-12-15T18:52:43Z)
- NoTVLA: Narrowing of Dense Action Trajectories for Generalizable Robot Manipulation [54.87964060934928]
Vision-Language-Action (VLA) models confront critical barriers to real-world deployment, most notably catastrophic forgetting. We propose the Narrowing of Trajectory VLA framework: a novel approach that narrows its focus to sparse trajectories. NoTVLA achieves superior performance and generalization compared to pi0 while operating under two critical constraints.
arXiv Detail & Related papers (2025-10-04T18:26:55Z)
- Self-Augmented Robot Trajectory: Efficient Imitation Learning via Safe Self-augmentation with Demonstrator-annotated Precision [2.3548641190233264]
Self-Augmented Robot Trajectory (SART) is a framework that enables policy learning from a single human demonstration. SART achieves substantially higher success rates than policies trained solely on human-collected demonstrations.
arXiv Detail & Related papers (2025-09-11T23:10:56Z)
- Physical Autoregressive Model for Robotic Manipulation without Action Pretraining [65.8971623698511]
We build upon autoregressive video generation models to propose a Physical Autoregressive Model (PAR). PAR leverages the world knowledge embedded in video pretraining to understand physical dynamics without requiring action pretraining. Experiments on the ManiSkill benchmark show that PAR achieves a 100% success rate on the PushCube task.
arXiv Detail & Related papers (2025-08-13T13:54:51Z)
- Action Flow Matching for Continual Robot Learning [54.10050120844738]
Continual learning in robotics seeks systems that can constantly adapt to changing environments and tasks. We introduce a generative framework leveraging flow matching for online robot dynamics model alignment. We find that by transforming the actions themselves rather than exploring with a misaligned model, the robot collects informative data more efficiently.
arXiv Detail & Related papers (2025-04-25T16:26:15Z)
- RoboGrasp: A Universal Grasping Policy for Robust Robotic Control [8.189496387470726]
RoboGrasp is a universal grasping policy framework that integrates pretrained grasp detection models with robotic learning. It significantly enhances grasp precision, stability, and generalizability, achieving up to 34% higher success rates in few-shot learning and grasping box prompt tasks.
arXiv Detail & Related papers (2025-02-05T11:04:41Z)
- Robotic World Model: A Neural Network Simulator for Robust Policy Optimization in Robotics [50.191655141020505]
This work advances model-based reinforcement learning by addressing the challenges of long-horizon prediction, error accumulation, and sim-to-real transfer. By providing a scalable and robust framework, the introduced methods pave the way for adaptive and efficient robotic systems in real-world applications.
arXiv Detail & Related papers (2025-01-17T10:39:09Z)
- A Backbone for Long-Horizon Robot Task Understanding [8.889888977376886]
The Therblig-Based Backbone Framework (TBBF) is a structure to enhance interpretability, data efficiency, and generalization in robotic systems. TBBF utilizes expert demonstrations to enable therblig-level task decomposition. During the offline training stage, we developed the Meta-RGate SynerFusion network for accurate therblig segmentation. In the online testing stage, after a one-shot demonstration of a new task is collected, our MGSF network extracts high-level knowledge.
arXiv Detail & Related papers (2024-08-02T15:32:42Z)
- MetaGraspNet: A Large-Scale Benchmark Dataset for Vision-driven Robotic Grasping via Physics-based Metaverse Synthesis [78.26022688167133]
We present a large-scale benchmark dataset for vision-driven robotic grasping via physics-based metaverse synthesis.
The proposed dataset contains 100,000 images and 25 different object types.
We also propose a new layout-weighted performance metric alongside the dataset for evaluating object detection and segmentation performance.
arXiv Detail & Related papers (2021-12-29T17:23:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.