ManiSkill3: GPU Parallelized Robotics Simulation and Rendering for Generalizable Embodied AI
- URL: http://arxiv.org/abs/2410.00425v1
- Date: Tue, 1 Oct 2024 06:10:39 GMT
- Title: ManiSkill3: GPU Parallelized Robotics Simulation and Rendering for Generalizable Embodied AI
- Authors: Stone Tao, Fanbo Xiang, Arth Shukla, Yuzhe Qin, Xander Hinrichsen, Xiaodi Yuan, Chen Bao, Xinsong Lin, Yulin Liu, Tse-kai Chan, Yuan Gao, Xuanlin Li, Tongzhou Mu, Nan Xiao, Arnav Gurha, Zhiao Huang, Roberto Calandra, Rui Chen, Shan Luo, Hao Su
- Abstract summary: ManiSkill3 is the fastest state-visual GPU parallelized robotics simulator with contact-rich physics targeting generalizable manipulation.
ManiSkill3 supports GPU parallelization of many aspects including simulation+rendering, heterogeneous simulation, pointclouds/voxels visual input, and more.
- Score: 27.00155119759743
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Simulation has enabled unprecedented compute-scalable approaches to robot learning. However, many existing simulation frameworks typically support a narrow range of scenes/tasks and lack features critical for scaling generalizable robotics and sim2real. We introduce and open source ManiSkill3, the fastest state-visual GPU parallelized robotics simulator with contact-rich physics targeting generalizable manipulation. ManiSkill3 supports GPU parallelization of many aspects including simulation+rendering, heterogeneous simulation, pointclouds/voxels visual input, and more. Simulation with rendering on ManiSkill3 can run 10-1000x faster with 2-3x less GPU memory usage than other platforms, achieving up to 30,000+ FPS in benchmarked environments due to minimal Python/PyTorch overhead in the system, simulation on the GPU, and the use of the SAPIEN parallel rendering system. Tasks that used to take hours to train can now take minutes. We further provide the most comprehensive range of GPU parallelized environments/tasks spanning 12 distinct domains including but not limited to mobile manipulation for tasks such as drawing, humanoids, and dexterous manipulation in realistic scenes designed by artists or real-world digital twins. In addition, millions of demonstration frames are provided from motion planning, RL, and teleoperation. ManiSkill3 also provides a comprehensive set of baselines that span popular RL and learning-from-demonstrations algorithms.
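As a concrete illustration of the batched, GPU-parallelized environment interface the abstract describes, below is a minimal usage sketch in the gymnasium style that ManiSkill releases follow. The environment ID ("PickCube-v1") and keyword arguments (num_envs, obs_mode, sim_backend) are assumptions based on ManiSkill conventions rather than a guaranteed API; consult the ManiSkill3 documentation for exact names.
```python
# Hedged sketch: stepping GPU-batched ManiSkill3 environments via gymnasium.
# The env ID and kwargs below are assumed from ManiSkill conventions and may
# differ in the actual release; see the official ManiSkill3 docs.
import gymnasium as gym
import torch
import mani_skill.envs  # noqa: F401 -- importing registers the environments

env = gym.make(
    "PickCube-v1",
    num_envs=1024,       # simulate 1024 environments in parallel on the GPU
    obs_mode="state",    # visual modes such as "rgbd"/"pointcloud" also exist
    sim_backend="gpu",
)
obs, info = env.reset(seed=0)  # observations come back as batched torch tensors
for _ in range(200):
    action = torch.as_tensor(env.action_space.sample())  # random batched actions
    obs, reward, terminated, truncated, info = env.step(action)
env.close()
```
Because the interaction loop stays on the GPU as batched tensors, Python-side work per step is constant in the number of environments, which matches the abstract's attribution of throughput to minimal Python/PyTorch overhead plus GPU simulation and parallel rendering.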
Related papers
- Towards a Modern and Lightweight Rendering Engine for Dynamic Robotic Simulations [4.226502078427161]
This paper presents a performance-focused and lightweight rendering engine supporting the Vulkan graphics API.
The engine is designed to modernize the legacy rendering pipeline of Asynchronous Multi-Body Framework (AMBF)
Experiments show that the engine can render a simulated scene with over seven million triangles while maintaining GPU computation times within two milliseconds.
arXiv Detail & Related papers (2024-10-07T14:50:19Z)
- Scaling Face Interaction Graph Networks to Real World Scenes [12.519862235430153]
We introduce a method which substantially reduces the memory required to run graph-based learned simulators.
We show that our method uses substantially less memory than previous graph-based simulators while retaining their accuracy.
This paves the way for expanding the application of learned simulators to settings where only perceptual information is available at inference time.
arXiv Detail & Related papers (2024-01-22T14:38:25Z)
- EvaSurf: Efficient View-Aware Implicit Textured Surface Reconstruction on Mobile Devices [53.28220984270622]
We present an implicit textured surface reconstruction method for mobile devices.
Our method can reconstruct high-quality appearance and accurate mesh on both synthetic and real-world datasets.
Our method can be trained in just 1-2 hours on a single GPU and runs on mobile devices at over 40 FPS (frames per second).
arXiv Detail & Related papers (2023-11-16T11:30:56Z)
- Learning Interactive Real-World Simulators [96.5991333400566]
We explore the possibility of learning a universal simulator of real-world interaction through generative modeling.
We use the simulator to train both high-level vision-language policies and low-level reinforcement learning policies.
Video captioning models can benefit from training with simulated experience, opening up even wider applications.
arXiv Detail & Related papers (2023-10-09T19:42:22Z)
- ManiSkill2: A Unified Benchmark for Generalizable Manipulation Skills [24.150758623016195]
We present ManiSkill2, the next generation of the SAPIEN ManiSkill benchmark for generalizable manipulation skills.
ManiSkill2 includes 20 manipulation task families with 2000+ object models and 4M+ demonstration frames.
It defines a unified interface and evaluation protocol to support a wide range of algorithms.
It empowers fast visual input learning algorithms so that a CNN-based policy can collect samples at about 2000 FPS.
arXiv Detail & Related papers (2023-02-09T14:24:01Z)
- Optimizing Data Collection in Deep Reinforcement Learning [4.9709347068704455]
GPU vectorization can achieve up to 1024x speedup over commonly used CPU simulators.
We show that kernel fusion speedups with a simple simulator are 11.3x and increase by up to 1024x as simulator complexity increases in terms of memory bandwidth requirements (see the toy vectorization sketch after this list).
arXiv Detail & Related papers (2022-07-15T20:22:31Z)
- VRKitchen2.0-IndoorKit: A Tutorial for Augmented Indoor Scene Building in Omniverse [77.52012928882928]
IndoorKit is a built-in toolkit for NVIDIA Omniverse.
It provides flexible pipelines for indoor scene building, scene randomizing, and animation controls.
arXiv Detail & Related papers (2022-06-23T17:53:33Z)
- Megaverse: Simulating Embodied Agents at One Million Experiences per Second [75.1191260838366]
We present Megaverse, a new 3D simulation platform for reinforcement learning and embodied AI research.
Megaverse is up to 70x faster than DeepMind Lab in fully-shaded 3D scenes with interactive objects.
We use Megaverse to build a new benchmark that consists of several single-agent and multi-agent tasks.
arXiv Detail & Related papers (2021-07-17T03:16:25Z)
- Large Batch Simulation for Deep Reinforcement Learning [101.01408262583378]
We accelerate deep reinforcement learning-based training in visually complex 3D environments by two orders of magnitude over prior work.
We realize end-to-end training speeds of over 19,000 frames of experience per second on a single GPU and up to 72,000 frames per second on a single eight-GPU machine.
By combining batch simulation and performance optimizations, we demonstrate that point-goal navigation agents can be trained in complex 3D environments on a single GPU in 1.5 days to 97% of the accuracy of agents trained on a prior state-of-the-art system.
arXiv Detail & Related papers (2021-03-12T00:22:50Z)
- Multi-GPU SNN Simulation with Perfect Static Load Balancing [0.8360870648463651]
We present an SNN simulator that scales to millions of neurons, billions of synapses, and 8 GPUs.
This is made possible by 1) a novel, cache-aware spike transmission algorithm, 2) a model-parallel multi-GPU distribution scheme, and 3) a static yet very effective load-balancing strategy.
arXiv Detail & Related papers (2021-02-09T07:07:34Z)
- Reactive Long Horizon Task Execution via Visual Skill and Precondition Models [59.76233967614774]
We describe an approach for sim-to-real training that can accomplish unseen robotic tasks using models learned in simulation to ground components of a simple task planner.
We show an increase in success rate from 91.6% to 98% in simulation, and from 10% to 80% in the real world, compared with naive baselines.
arXiv Detail & Related papers (2020-11-17T15:24:01Z)
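The "Optimizing Data Collection in Deep Reinforcement Learning" entry above attributes its speedups to GPU vectorization and kernel fusion. Below is a toy, hypothetical sketch of that pattern in PyTorch; the point-mass simulator, its reward, and all names are invented for illustration and do not come from any of the papers listed.
```python
# Toy sketch (invented, not from any paper above): GPU vectorization of a
# point-mass simulator. All N environments advance through batched tensor
# ops, so Python-side work per step is constant regardless of N.
import torch

class BatchedPointMass:
    def __init__(self, num_envs, dt=0.01, device=None):
        self.device = device or ("cuda" if torch.cuda.is_available() else "cpu")
        self.dt = dt
        self.pos = torch.zeros(num_envs, 2, device=self.device)
        self.vel = torch.zeros(num_envs, 2, device=self.device)

    @torch.no_grad()
    def step(self, action):
        # Each line launches one batched kernel over all environments; a JIT
        # such as torch.compile can fuse these elementwise updates, which is
        # the "kernel fusion" effect benchmarked in the entry above.
        self.vel += action.to(self.device) * self.dt
        self.pos += self.vel * self.dt
        reward = -self.pos.norm(dim=-1)  # toy reward: negative distance to origin
        return self.pos.clone(), reward

envs = BatchedPointMass(num_envs=1024)
obs, reward = envs.step(torch.randn(1024, 2))
```
Stepping all 1024 environments costs a handful of kernel launches rather than 1024 Python loop iterations, which is the source of the vectorization speedups cited above.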