Related papers: Sample Factory: Egocentric 3D Control from Pixels at 100000 FPS with Asynchronous Reinforcement Learning

Sample Factory: Egocentric 3D Control from Pixels at 100000 FPS with Asynchronous Reinforcement Learning

URL: http://arxiv.org/abs/2006.11751v2
Date: Tue, 23 Jun 2020 00:41:58 GMT
Title: Sample Factory: Egocentric 3D Control from Pixels at 100000 FPS with Asynchronous Reinforcement Learning
Authors: Aleksei Petrenko, Zhehui Huang, Tushar Kumar, Gaurav Sukhatme, Vladlen Koltun
Abstract summary: "Sample Factory" is a high- throughput training system optimized for a single-machine setting. Our architecture combines a highly efficient, asynchronous, GPU-based sampler with off-policy correction techniques. We extend Sample Factory to support self-play and population-based training and apply these techniques to train highly capable agents for a multiplayer first-person shooter game.
Score: 68.2099740607854
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Increasing the scale of reinforcement learning experiments has allowed researchers to achieve unprecedented results in both training sophisticated agents for video games, and in sim-to-real transfer for robotics. Typically such experiments rely on large distributed systems and require expensive hardware setups, limiting wider access to this exciting area of research. In this work we aim to solve this problem by optimizing the efficiency and resource utilization of reinforcement learning algorithms instead of relying on distributed computation. We present the "Sample Factory", a high-throughput training system optimized for a single-machine setting. Our architecture combines a highly efficient, asynchronous, GPU-based sampler with off-policy correction techniques, allowing us to achieve throughput higher than $10^5$ environment frames/second on non-trivial control problems in 3D without sacrificing sample efficiency. We extend Sample Factory to support self-play and population-based training and apply these techniques to train highly capable agents for a multiplayer first-person shooter game. The source code is available at https://github.com/alex-petrenko/sample-factory

Related papers

AutoHete: An Automatic and Efficient Heterogeneous Training System for LLMs [68.99086112477565]
Transformer-based large language models (LLMs) have demonstrated exceptional capabilities in sequence modeling and text generation. Existing heterogeneous training methods significantly expand the scale of trainable models but introduce substantial communication overheads and CPU workloads. We propose AutoHete, an automatic and efficient heterogeneous training system compatible with both single- GPU and multi- GPU environments.
arXiv Detail & Related papers (2025-02-27T14:46:22Z)
OccGaussian: 3D Gaussian Splatting for Occluded Human Rendering [55.50438181721271]
Previous method utilizing NeRF for surface rendering to recover the occluded areas requires more than one day to train and several seconds to render occluded areas. We propose OccGaussian based on 3D Gaussian Splatting, which can be trained within 6 minutes and produces high-quality human renderings up to 160 FPS with occluded input.
arXiv Detail & Related papers (2024-04-12T13:00:06Z)
SERL: A Software Suite for Sample-Efficient Robotic Reinforcement Learning [85.21378553454672]
We develop a library containing a sample efficient off-policy deep RL method, together with methods for computing rewards and resetting the environment. We find that our implementation can achieve very efficient learning, acquiring policies for PCB board assembly, cable routing, and object relocation. These policies achieve perfect or near-perfect success rates, extreme robustness even under perturbations, and exhibit emergent robustness recovery and correction behaviors.
arXiv Detail & Related papers (2024-01-29T10:01:10Z)
Reinforcement Learning with Foundation Priors: Let the Embodied Agent Efficiently Learn on Its Own [59.11934130045106]
We propose Reinforcement Learning with Foundation Priors (RLFP) to utilize guidance and feedback from policy, value, and success-reward foundation models. Within this framework, we introduce the Foundation-guided Actor-Critic (FAC) algorithm, which enables embodied agents to explore more efficiently with automatic reward functions. Our method achieves remarkable performances in various manipulation tasks on both real robots and in simulation.
arXiv Detail & Related papers (2023-10-04T07:56:42Z)
Efficient Training for Visual Tracking with Deformable Transformer [0.0]
We present DETRack, a streamlined end-to-end visual object tracking framework. Our framework utilizes an efficient encoder-decoder structure where the deformable transformer decoder acting as a target head. For training, we introduce a novel one-to-many label assignment and an auxiliary denoising technique.
arXiv Detail & Related papers (2023-09-06T03:07:43Z)
Learning to Optimize Permutation Flow Shop Scheduling via Graph-based Imitation Learning [70.65666982566655]
Permutation flow shop scheduling (PFSS) is widely used in manufacturing systems. We propose to train the model via expert-driven imitation learning, which accelerates convergence more stably and accurately. Our model's network parameters are reduced to only 37% of theirs, and the solution gap of our model towards the expert solutions decreases from 6.8% to 1.3% on average.
arXiv Detail & Related papers (2022-10-31T09:46:26Z)
Efficiently Training On-Policy Actor-Critic Networks in Robotic Deep Reinforcement Learning with Demonstration-like Sampled Exploration [7.930709072852582]
We propose a generic framework for Learning from Demonstration (LfD) based on actor-critic algorithms. We conduct experiments on 4 standard benchmark environments in Mujoco and 2 self-designed robotic environments.
arXiv Detail & Related papers (2021-09-27T12:42:05Z)
Large Batch Simulation for Deep Reinforcement Learning [101.01408262583378]
We accelerate deep reinforcement learning-based training in visually complex 3D environments by two orders of magnitude over prior work. We realize end-to-end training speeds of over 19,000 frames of experience per second on a single and up to 72,000 frames per second on a single eight- GPU machine. By combining batch simulation and performance optimizations, we demonstrate that Point navigation agents can be trained in complex 3D environments on a single GPU in 1.5 days to 97% of the accuracy of agents trained on a prior state-of-the-art system.
arXiv Detail & Related papers (2021-03-12T00:22:50Z)
Accelerating Deep Neuroevolution on Distributed FPGAs for Reinforcement Learning Problems [0.7734726150561088]
We report record training times (running at about 1 million frames per second) for Atari 2600 games using deep neuroevolution implemented on distributed FPGAs. Results are the first application demonstration on the IBM Neural Computer.
arXiv Detail & Related papers (2020-05-10T00:41:39Z)

This list is automatically generated from the titles and abstracts of the papers in this site.