Large Batch Simulation for Deep Reinforcement Learning
- URL: http://arxiv.org/abs/2103.07013v1
- Date: Fri, 12 Mar 2021 00:22:50 GMT
- Title: Large Batch Simulation for Deep Reinforcement Learning
- Authors: Brennan Shacklett, Erik Wijmans, Aleksei Petrenko, Manolis Savva,
Dhruv Batra, Vladlen Koltun, Kayvon Fatahalian
- Abstract summary: We accelerate deep reinforcement learning-based training in visually complex 3D environments by two orders of magnitude over prior work.
We realize end-to-end training speeds of over 19,000 frames of experience per second on a single GPU and up to 72,000 frames per second on a single eight-GPU machine.
By combining batch simulation and performance optimizations, we demonstrate that PointGoal navigation agents can be trained in complex 3D environments on a single GPU in 1.5 days to 97% of the accuracy of agents trained on a prior state-of-the-art system.
- Score: 101.01408262583378
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We accelerate deep reinforcement learning-based training in visually complex
3D environments by two orders of magnitude over prior work, realizing
end-to-end training speeds of over 19,000 frames of experience per second on a
single GPU and up to 72,000 frames per second on a single eight-GPU machine.
The key idea of our approach is to design a 3D renderer and embodied navigation
simulator around the principle of "batch simulation": accepting and executing
large batches of requests simultaneously. Beyond exposing large amounts of work
at once, batch simulation allows implementations to amortize in-memory storage
of scene assets, rendering work, data loading, and synchronization costs across
many simulation requests, dramatically improving the number of simulated agents
per GPU and overall simulation throughput. To balance DNN inference and
training costs with faster simulation, we also build a computationally
efficient policy DNN that maintains high task performance, and modify training
algorithms to maintain sample efficiency when training with large mini-batches.
By combining batch simulation and DNN performance optimizations, we demonstrate
that PointGoal navigation agents can be trained in complex 3D environments on a
single GPU in 1.5 days to 97% of the accuracy of agents trained on a prior
state-of-the-art system using a 64-GPU cluster over three days. We provide
open-source reference implementations of our batch 3D renderer and simulator to
facilitate incorporation of these ideas into RL systems.
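The core idea of the abstract, executing one large batch of simulation requests per call so that rendering, asset storage, and synchronization costs are amortized across many agents, can be illustrated with a minimal sketch. All class and function names below are hypothetical and for illustration only; they are not the paper's actual API, and the simulator here just returns placeholder observations.

```python
# Minimal sketch of the "batch simulation" pattern: the simulator steps all
# environments in one call, and the policy runs one batched inference pass,
# instead of stepping and inferring one environment at a time.
import numpy as np

class BatchSimulator:
    """Hypothetical batched simulator: one step() call advances every env."""

    def __init__(self, num_envs, obs_shape=(64, 64, 3)):
        self.num_envs = num_envs
        self.obs_shape = obs_shape

    def reset(self):
        # A single call produces observations for every environment at once.
        return np.zeros((self.num_envs, *self.obs_shape), dtype=np.uint8)

    def step(self, actions):
        # `actions` has shape (num_envs,); the whole batch is executed before
        # returning, amortizing per-call overheads across all agents.
        assert actions.shape[0] == self.num_envs
        obs = np.zeros((self.num_envs, *self.obs_shape), dtype=np.uint8)
        rewards = np.zeros(self.num_envs, dtype=np.float32)
        dones = np.zeros(self.num_envs, dtype=bool)
        return obs, rewards, dones

def rollout(sim, policy, steps):
    """Collect experience; each batched step yields num_envs frames."""
    obs = sim.reset()
    total_frames = 0
    for _ in range(steps):
        actions = policy(obs)          # one batched inference call
        obs, rewards, dones = sim.step(actions)
        total_frames += sim.num_envs
    return total_frames

sim = BatchSimulator(num_envs=1024)
frames = rollout(sim, lambda obs: np.zeros(len(obs), dtype=np.int64), steps=8)
print(frames)  # 8192: 8 batched steps x 1024 environments
```

Note how throughput scales with `num_envs` rather than with the number of `step()` calls; this is why the paper pairs batch simulation with a computationally efficient policy DNN and large-mini-batch training.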
Related papers
- Waymax: An Accelerated, Data-Driven Simulator for Large-Scale Autonomous Driving Research [76.93956925360638]
Waymax is a new data-driven simulator for autonomous driving in multi-agent scenes.
It runs entirely on hardware accelerators such as TPUs/GPUs and supports in-graph simulation for training.
We benchmark a suite of popular imitation and reinforcement learning algorithms with ablation studies on different design decisions.
arXiv Detail & Related papers (2023-10-12T20:49:15Z)
- SciAI4Industry -- Solving PDEs for industry-scale problems with deep learning [1.642765885524881]
We introduce a distributed programming API for simulating training data in parallel on the cloud without requiring users to manage the underlying HPC infrastructure.
We train large-scale neural networks for solving the 3D Navier-Stokes equation and simulating 3D CO2 flow in porous media.
For the CO2 example, we simulate a training dataset based on a commercial carbon capture and storage (CCS) project and train a neural network for CO2 flow simulation on a 3D grid with over 2 million cells. The trained network is five orders of magnitude faster than a conventional numerical simulator and 3,200 times cheaper.
arXiv Detail & Related papers (2022-11-23T05:15:32Z)
- Continual learning autoencoder training for a particle-in-cell simulation via streaming [52.77024349608834]
The upcoming exascale era will provide a new generation of high-resolution physics simulations.
This resolution will impact the training of machine learning models, since storing such large amounts of simulation data on disk is nearly impossible.
This work presents an approach that trains a neural network concurrently with a running simulation, without writing data to disk.
arXiv Detail & Related papers (2022-11-09T09:55:14Z)
- Parallel Reinforcement Learning Simulation for Visual Quadrotor Navigation [4.597465975849579]
Reinforcement learning (RL) is an agent-based approach for teaching robots to navigate within the physical world.
We present a simulation framework, built on AirSim, which provides efficient parallel training.
Building on this framework, Ape-X is modified to incorporate decentralised training of AirSim environments.
arXiv Detail & Related papers (2022-09-22T15:27:42Z)
- Data-Driven Offline Optimization For Architecting Hardware Accelerators [89.68870139177785]
We develop a data-driven offline optimization method for designing hardware accelerators, dubbed PRIME.
PRIME improves performance upon state-of-the-art simulation-driven methods by about 1.54x and 1.20x, while considerably reducing the required total simulation time by 93% and 99%, respectively.
In addition, PRIME also architects effective accelerators for unseen applications in a zero-shot setting, outperforming simulation-based methods by 1.26x.
arXiv Detail & Related papers (2021-10-20T17:06:09Z)
- Accelerating Training and Inference of Graph Neural Networks with Fast Sampling and Pipelining [58.10436813430554]
Mini-batch training of graph neural networks (GNNs) requires a lot of computation and data movement.
We argue in favor of performing mini-batch training with neighborhood sampling in a distributed multi-GPU environment.
We present a sequence of improvements to mitigate these bottlenecks, including a performance-engineered neighborhood sampler.
We also conduct an empirical analysis that supports the use of sampling for inference, showing that test accuracies are not materially compromised.
arXiv Detail & Related papers (2021-10-16T02:41:35Z)
- Scheduling Optimization Techniques for Neural Network Training [3.1617796705744547]
This paper proposes out-of-order (ooo) backprop, an effective scheduling technique for neural network training.
We show that GPU utilization in single-GPU, data-parallel, and pipeline-parallel training can be commonly improved by applying ooo backprop.
arXiv Detail & Related papers (2021-10-03T05:45:06Z)
- Sample Factory: Egocentric 3D Control from Pixels at 100000 FPS with Asynchronous Reinforcement Learning [68.2099740607854]
"Sample Factory" is a high- throughput training system optimized for a single-machine setting.
Our architecture combines a highly efficient, asynchronous, GPU-based sampler with off-policy correction techniques.
We extend Sample Factory to support self-play and population-based training and apply these techniques to train highly capable agents for a multiplayer first-person shooter game.
arXiv Detail & Related papers (2020-06-21T10:00:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences.