Related papers: GRS: Generating Robotic Simulation Tasks from Real-World Images

URL: http://arxiv.org/abs/2410.15536v1
Date: Sun, 20 Oct 2024 23:33:06 GMT
Title: GRS: Generating Robotic Simulation Tasks from Real-World Images
Authors: Alex Zook, Fan-Yun Sun, Josef Spjut, Valts Blukis, Stan Birchfield, Jonathan Tremblay
Abstract summary: GRS (Generating Robotic Simulation tasks) is a novel system to address the challenge of real-to-sim in robotics, computer vision, and AR/VR. We use state-of-the-art vision-language models (VLMs) to achieve a comprehensive real-to-sim pipeline. GRS operates in three stages: 1) scene comprehension using SAM2 for object segmentation, 2) matching identified objects with simulation-ready assets, and 3) generating contextually appropriate robotic tasks.
Score: 18.64560477109255
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We introduce GRS (Generating Robotic Simulation tasks), a novel system to address the challenge of real-to-sim in robotics, computer vision, and AR/VR. GRS enables the creation of digital twin simulations from single real-world RGB-D observations, complete with diverse, solvable tasks for virtual agent training. We use state-of-the-art vision-language models (VLMs) to achieve a comprehensive real-to-sim pipeline. GRS operates in three stages: 1) scene comprehension using SAM2 for object segmentation and VLMs for object description, 2) matching identified objects with simulation-ready assets, and 3) generating contextually appropriate robotic tasks. Our approach ensures simulations align with task specifications by generating test suites designed to verify adherence to the task specification. We introduce a router that iteratively refines the simulation and test code to ensure the simulation is solvable by a robot policy while remaining aligned to the task specification. Our experiments demonstrate the system's efficacy in accurately identifying object correspondence, which allows us to generate task environments that closely match input environments, and enhance automated simulation task generation through our novel router mechanism.
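The abstract's router idea (repeatedly check the generated simulation against its tests and against a robot policy, then repair either the simulation or the tests) can be illustrated with a minimal Python sketch. This is not the GRS implementation: `Candidate`, `refine`, and every callable passed to it are hypothetical names introduced here, and the toy lambdas stand in for the VLM calls, simulator runs, and policy rollouts that a real system would need.

```python
from dataclasses import dataclass, replace
from typing import Callable, Literal, Tuple


@dataclass(frozen=True)
class Candidate:
    """Hypothetical container for one generated task."""
    sim_code: str   # generated simulation program
    test_code: str  # generated tests encoding the task specification


Route = Literal["sim", "tests"]


def refine(
    candidate: Candidate,
    tests_pass: Callable[[Candidate], bool],
    policy_solves: Callable[[Candidate], bool],
    route: Callable[[Candidate, bool, bool], Route],
    fix_sim: Callable[[Candidate], str],
    fix_tests: Callable[[Candidate], str],
    max_iters: int = 5,
) -> Tuple[Candidate, bool]:
    """Run up to max_iters check-and-repair rounds; return (candidate, verified)."""
    for _ in range(max_iters):
        aligned = tests_pass(candidate)                   # matches the task spec?
        solvable = aligned and policy_solves(candidate)   # can a policy solve it?
        if solvable:
            return candidate, True
        # The router decides which artifact to repair in this round.
        if route(candidate, aligned, solvable) == "sim":
            candidate = replace(candidate, sim_code=fix_sim(candidate))
        else:
            candidate = replace(candidate, test_code=fix_tests(candidate))
    # Re-check the result of the final repair before giving up.
    verified = tests_pass(candidate) and policy_solves(candidate)
    return candidate, verified


# Toy stand-ins (assumptions for illustration only).
start = Candidate(sim_code="spawn(cube)", test_code="assert cube_on_table")
final, ok = refine(
    start,
    tests_pass=lambda c: "table" in c.sim_code,
    policy_solves=lambda c: True,
    route=lambda c, aligned, solvable: "sim",
    fix_sim=lambda c: c.sim_code + "; spawn(table)",
    fix_tests=lambda c: c.test_code,
)
print(ok, final.sim_code)
```

In this sketch the routing decision is passed in as a plain callable so that a rule-based stub can be swapped for a model-backed router without changing the loop. The iteration cap keeps an unsolvable task from looping forever.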

Related papers

Video2Policy: Scaling up Manipulation Tasks in Simulation through Internet Videos [61.925837909969815]
We introduce Video2Policy, a novel framework that leverages internet RGB videos to reconstruct tasks based on everyday human behavior. Our method can successfully train RL policies on such tasks, including complex and challenging tasks such as throwing. We show that the generated simulation data can be scaled up for training a general policy, and it can be transferred back to the real robot in a Real2Sim2Real way.
arXiv Detail & Related papers (2025-02-14T03:22:03Z)
Dynamics as Prompts: In-Context Learning for Sim-to-Real System Identifications [23.94013806312391]
We propose a novel approach that dynamically adjusts simulation environment parameters online using in-context learning. We validate our approach across two tasks: object scooping and table air hockey. Our approach delivers efficient and smooth system identification, advancing the deployment of robots in dynamic real-world scenarios.
arXiv Detail & Related papers (2024-10-27T07:13:38Z)
DrEureka: Language Model Guided Sim-To-Real Transfer [64.14314476811806]
Transferring policies learned in simulation to the real world is a promising strategy for acquiring robot skills at scale. In this paper, we investigate using Large Language Models (LLMs) to automate and accelerate sim-to-real design. Our approach is capable of solving novel robot tasks, such as quadruped balancing and walking atop a yoga ball.
arXiv Detail & Related papers (2024-06-04T04:53:05Z)
Interactive Planning Using Large Language Models for Partially Observable Robotics Tasks [54.60571399091711]
Large Language Models (LLMs) have achieved impressive results in creating robotic agents for performing open vocabulary tasks. We present an interactive planning technique for partially observable tasks using LLMs.
arXiv Detail & Related papers (2023-12-11T22:54:44Z)
Gen2Sim: Scaling up Robot Learning in Simulation with Generative Models [17.757495961816783]
Gen2Sim is a method for scaling up robot skill learning in simulation by automating generation of 3D assets, task descriptions, task decompositions and reward functions. Our work contributes hundreds of simulated assets, tasks and demonstrations, taking a step towards fully autonomous robotic manipulation skill acquisition in simulation.
arXiv Detail & Related papers (2023-10-27T17:55:32Z)
GenSim: Generating Robotic Simulation Tasks via Large Language Models [34.79613485106202]
GenSim aims to automatically generate rich simulation environments and expert demonstrations. We use GPT4 to expand the existing benchmark by ten times to over 100 tasks. With minimal sim-to-real adaptation, multitask policies pretrained on GPT4-generated simulation tasks exhibit stronger transfer to unseen long-horizon tasks in the real world.
arXiv Detail & Related papers (2023-10-02T17:23:48Z)
Learning Sim-to-Real Dense Object Descriptors for Robotic Manipulation [4.7246285569677315]
We present Sim-to-Real Dense Object Nets (SRDONs), a dense object descriptor that not only understands the object via an appropriate representation but also maps simulated and real data to a unified feature space with pixel consistency. We demonstrate in experiments that pre-trained SRDONs significantly improve performance on unseen objects and unseen visual environments for various robotic tasks with zero real-world training.
arXiv Detail & Related papers (2023-04-18T02:28:55Z)
AdaptSim: Task-Driven Simulation Adaptation for Sim-to-Real Transfer [10.173835871228718]
AdaptSim aims to optimize task performance in target (real) environments. First, we meta-learn an adaptation policy in simulation using reinforcement learning. We then perform iterative real-world adaptation by inferring new simulation parameter distributions for policy training.
arXiv Detail & Related papers (2023-02-09T19:10:57Z)
SAGCI-System: Towards Sample-Efficient, Generalizable, Compositional, and Incremental Robot Learning [41.19148076789516]
We introduce a systematic learning framework called SAGCI-system, aimed at sample-efficient, generalizable, compositional, and incremental robot learning. Our system first takes the raw point clouds gathered by the camera mounted on the robot's wrist as input and produces an initial model of the surrounding environment represented as a URDF. The robot then uses interactive perception to interact with the environment, verifying and modifying the URDF online.
arXiv Detail & Related papers (2021-11-29T16:53:49Z)
DriveGAN: Towards a Controllable High-Quality Neural Simulation [147.6822288981004]
We introduce a novel high-quality neural simulator referred to as DriveGAN. DriveGAN achieves controllability by disentangling different components without supervision. We train DriveGAN on multiple datasets, including 160 hours of real-world driving data.
arXiv Detail & Related papers (2021-04-30T15:30:05Z)
Auto-Tuned Sim-to-Real Transfer [143.44593793640814]
Policies trained in simulation often fail when transferred to the real world. Current approaches to tackle this problem, such as domain randomization, require prior knowledge and engineering. We propose a method for automatically tuning simulator system parameters to match the real world.
arXiv Detail & Related papers (2021-04-15T17:59:55Z)
Reactive Long Horizon Task Execution via Visual Skill and Precondition Models [59.76233967614774]
We describe an approach for sim-to-real training that can accomplish unseen robotic tasks using models learned in simulation to ground components of a simple task planner. Compared with naive baselines, we show an increase in success rate from 91.6% to 98% in simulation and from 10% to 80% in the real world.
arXiv Detail & Related papers (2020-11-17T15:24:01Z)
Point Cloud Based Reinforcement Learning for Sim-to-Real and Partial Observability in Visual Navigation [62.22058066456076]
Reinforcement Learning (RL) is a powerful tool for solving complex robotic tasks. However, RL does not work directly in the real world, which is known as the sim-to-real transfer problem. We propose a method that learns on an observation space constructed from point clouds and environment randomization.
arXiv Detail & Related papers (2020-07-27T17:46:59Z)
RL-CycleGAN: Reinforcement Learning Aware Simulation-To-Real [74.45688231140689]
We introduce the RL-scene consistency loss for image translation, which ensures that the translation operation is invariant with respect to the Q-values associated with the image. We obtain RL-CycleGAN, a new approach for simulation-to-real-world transfer for reinforcement learning.
arXiv Detail & Related papers (2020-06-16T08:58:07Z)
SAPIEN: A SimulAted Part-based Interactive ENvironment [77.4739790629284]
SAPIEN is a realistic and physics-rich simulated environment that hosts a large-scale set of articulated objects. We evaluate state-of-the-art vision algorithms for part detection and motion attribute recognition, and demonstrate robotic interaction tasks.
arXiv Detail & Related papers (2020-03-19T00:11:34Z)

This list is automatically generated from the titles and abstracts of the papers on this site.