Related papers: Learning Interactive Real-World Simulators

URL: http://arxiv.org/abs/2310.06114v3
Date: Thu, 26 Sep 2024 17:14:09 GMT
Title: Learning Interactive Real-World Simulators
Authors: Sherry Yang, Yilun Du, Kamyar Ghasemipour, Jonathan Tompson, Leslie Kaelbling, Dale Schuurmans, Pieter Abbeel
Abstract summary: We explore the possibility of learning a universal simulator of real-world interaction through generative modeling. We use the simulator to train both high-level vision-language policies and low-level reinforcement learning policies. Video captioning models can benefit from training with simulated experience, opening up even wider applications.
Score: 96.5991333400566
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Generative models trained on internet data have revolutionized how text, image, and video content can be created. Perhaps the next milestone for generative models is to simulate realistic experience in response to actions taken by humans, robots, and other interactive agents. Applications of a real-world simulator range from controllable content creation in games and movies, to training embodied agents purely in simulation that can be directly deployed in the real world. We explore the possibility of learning a universal simulator (UniSim) of real-world interaction through generative modeling. We first make the important observation that natural datasets available for learning a real-world simulator are often rich along different dimensions (e.g., abundant objects in image data, densely sampled actions in robotics data, and diverse movements in navigation data). With careful orchestration of diverse datasets, each providing a different aspect of the overall experience, we can simulate the visual outcome of both high-level instructions such as "open the drawer" and low-level controls from otherwise static scenes and objects. We use the simulator to train both high-level vision-language policies and low-level reinforcement learning policies, each of which can be deployed in the real world in zero shot after training purely in simulation. We also show that other types of intelligence such as video captioning models can benefit from training with simulated experience, opening up even wider applications. Video demos can be found at https://universal-simulator.github.io.

Related papers

Sim-and-Real Co-Training: A Simple Recipe for Vision-Based Robotic Manipulation [40.96453435496208]
We present a recipe for utilizing simulation data to solve vision-based robotic manipulation tasks. Using two domains--a robot arm and a humanoid--we demonstrate that simulation data can enhance real-world task performance by an average of 38%.
arXiv Detail & Related papers (2025-03-31T17:39:38Z)
Video2Policy: Scaling up Manipulation Tasks in Simulation through Internet Videos [61.925837909969815]
We introduce Video2Policy, a novel framework that leverages internet RGB videos to reconstruct tasks based on everyday human behavior. Our method can successfully train RL policies on such tasks, including complex and challenging tasks such as throwing. We show that the generated simulation data can be scaled up for training a general policy, and it can be transferred back to the real robot in a Real2Sim2Real way.
arXiv Detail & Related papers (2025-02-14T03:22:03Z)
Robot Learning with Super-Linear Scaling [20.730206708381704]
CASHER is a pipeline for scaling up data collection and learning in simulation where the performance scales superlinearly with human effort. We show that CASHER enables fine-tuning of pre-trained policies to a target scenario using a video scan without any additional human effort.
arXiv Detail & Related papers (2024-12-02T18:12:02Z)
URDFormer: A Pipeline for Constructing Articulated Simulation Environments from Real-World Images [39.0780707100513]
We present an integrated end-to-end pipeline that generates simulation scenes complete with articulated kinematic and dynamic structures from real-world images. In doing so, our work provides both a pipeline for large-scale generation of simulation environments and an integrated system for training robust robotic control policies.
arXiv Detail & Related papers (2024-05-19T20:01:29Z)
Scaling Face Interaction Graph Networks to Real World Scenes [12.519862235430153]
We introduce a method which substantially reduces the memory required to run graph-based learned simulators. We show that our method uses substantially less memory than previous graph-based simulators while retaining their accuracy. This paves the way for expanding the application of learned simulators to settings where only perceptual information is available at inference time.
arXiv Detail & Related papers (2024-01-22T14:38:25Z)
Sim-to-Real via Sim-to-Seg: End-to-end Off-road Autonomous Driving Without Real Data [56.49494318285391]
We present Sim2Seg, a re-imagining of RCAN that crosses the visual reality gap for off-road autonomous driving. This is done by learning to translate randomized simulation images into simulated segmentation and depth maps. This allows us to train an end-to-end RL policy in simulation and deploy it directly in the real world.
arXiv Detail & Related papers (2022-10-25T17:50:36Z)
DeXtreme: Transfer of Agile In-hand Manipulation from Simulation to Reality [64.51295032956118]
We train a policy that can perform robust dexterous manipulation on an anthropomorphic robot hand. Our work reaffirms the possibilities of sim-to-real transfer for dexterous manipulation in diverse kinds of hardware and simulator setups.
arXiv Detail & Related papers (2022-10-25T01:51:36Z)
Towards Optimal Strategies for Training Self-Driving Perception Models in Simulation [98.51313127382937]
We focus on the use of labels in the synthetic domain alone. Our approach introduces both a way to learn neural-invariant representations and a theoretically inspired view on how to sample the data from the simulator. We showcase our approach on the bird's-eye-view vehicle segmentation task with multi-sensor data.
arXiv Detail & Related papers (2021-11-15T18:37:43Z)
DriveGAN: Towards a Controllable High-Quality Neural Simulation [147.6822288981004]
We introduce a novel high-quality neural simulator referred to as DriveGAN. DriveGAN achieves controllability by disentangling different components without supervision. We train DriveGAN on multiple datasets, including 160 hours of real-world driving data.
arXiv Detail & Related papers (2021-04-30T15:30:05Z)
SimAug: Learning Robust Representations from Simulation for Trajectory Prediction [78.91518036949918]
We propose a novel approach to learn robust representation through augmenting the simulation training data. We show that SimAug achieves promising results on three real-world benchmarks using zero real training data.
arXiv Detail & Related papers (2020-04-04T21:22:01Z)

This list is automatically generated from the titles and abstracts of the papers in this site.