ReZero: Boosting MCTS-based Algorithms by Backward-view and Entire-buffer Reanalyze
- URL: http://arxiv.org/abs/2404.16364v4
- Date: Mon, 26 Aug 2024 02:28:14 GMT
- Title: ReZero: Boosting MCTS-based Algorithms by Backward-view and Entire-buffer Reanalyze
- Authors: Chunyu Xuan, Yazhe Niu, Yuan Pu, Shuai Hu, Yu Liu, Jing Yang
- Abstract summary: We propose a general approach named ReZero to boost tree search operations for Monte Carlo Tree Search algorithms.
Specifically, we reanalyze training samples through a backward-view reuse technique which obtains the value estimation of a certain child node in advance.
Experiments conducted on Atari environments and board games demonstrate that ReZero substantially improves training speed while maintaining high sample efficiency.
- Score: 5.671696366787522
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Monte Carlo Tree Search (MCTS)-based algorithms, such as MuZero and its derivatives, have achieved widespread success in various decision-making domains. These algorithms employ the reanalyze process to enhance sample efficiency from stale data, albeit at the expense of significant wall-clock time consumption. To address this issue, we propose a general approach named ReZero to boost tree search operations for MCTS-based algorithms. Specifically, drawing inspiration from the one-armed bandit model, we reanalyze training samples through a backward-view reuse technique which obtains the value estimation of a certain child node in advance. To further adapt to this design, we periodically reanalyze the entire buffer instead of frequently reanalyzing the mini-batch. The synergy of these two designs can significantly reduce the search cost and meanwhile guarantee or even improve performance, simplifying both data collecting and reanalyzing. Experiments conducted on Atari environments and board games demonstrate that ReZero substantially improves training speed while maintaining high sample efficiency. The code is available as part of the LightZero benchmark at https://github.com/opendilab/LightZero.
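The two designs named in the abstract can be illustrated with a toy cost model. The sketch below is not the LightZero implementation; every name in it (`Step`, `search_root`, `SIMS_PER_CHILD`, and so on) is hypothetical, and the "search" is a stand-in that only counts simulations. It assumes that reanalyzing a trajectory in reverse time order makes the root value of the next state available before the current state is searched, so the child reached by the action actually taken needs no simulations of its own. It also assumes that the whole buffer is reanalyzed once every `period` training iterations.

```python
from dataclasses import dataclass
from typing import List, Optional

NUM_ACTIONS = 3      # hypothetical action-space size
SIMS_PER_CHILD = 4   # hypothetical simulation budget per child subtree
GAMMA = 0.99         # discount factor used when reusing a child value


@dataclass
class Step:
    state: int                          # toy state id
    action: int                         # action actually taken in the trajectory
    reward: float                       # reward received after taking `action`
    root_value: Optional[float] = None  # filled in by reanalyze


def evaluate_child(state: int, action: int):
    """Stand-in for searching one child subtree; returns (value, simulations used)."""
    value = ((state * 31 + action * 17) % 10) / 10.0
    return value, SIMS_PER_CHILD


def search_root(step: Step, known_child_value: Optional[float] = None):
    """Toy root search; skips the taken-action child when its value is known."""
    sims = 0
    q_values = []
    for a in range(NUM_ACTIONS):
        if a == step.action and known_child_value is not None:
            # Backward-view reuse: the child's value was obtained in advance.
            q = step.reward + GAMMA * known_child_value
        else:
            q, cost = evaluate_child(step.state, a)
            sims += cost
        q_values.append(q)
    return max(q_values), sims


def reanalyze_forward(trajectory: List[Step]) -> int:
    """Baseline: search every root independently, with no reuse."""
    total = 0
    for step in trajectory:
        step.root_value, sims = search_root(step)
        total += sims
    return total


def reanalyze_backward(trajectory: List[Step]) -> int:
    """Search from the last step to the first, reusing the next root value."""
    total = 0
    next_value = None
    for step in reversed(trajectory):
        step.root_value, sims = search_root(step, next_value)
        total += sims
        next_value = step.root_value
    return total


def reanalyze_buffer_periodically(buffer: List[List[Step]], num_iters: int, period: int) -> int:
    """Reanalyze the entire buffer once every `period` training iterations."""
    total = 0
    for it in range(num_iters):
        if it % period == 0:
            total += sum(reanalyze_backward(traj) for traj in buffer)
        # A training step on a sampled mini-batch would go here.
    return total


if __name__ == "__main__":
    traj = [Step(state=i, action=i % NUM_ACTIONS, reward=1.0) for i in range(5)]
    print("forward simulations:", reanalyze_forward(traj))
    print("backward simulations:", reanalyze_backward(traj))
```

In this toy model a five-step trajectory costs 60 simulations without reuse and 44 with backward-view reuse, because only the final step has to search all of its children. The real saving depends on the search budget and on how the known child value enters the tree search, which this sketch does not model.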
Related papers
- Optimizing Tensor Computation Graphs with Equality Saturation and Monte Carlo Tree Search [0.0]
We present a tensor graph rewriting approach that uses Monte Carlo tree search to build a superior representation.
Our approach improves the inference speedup of neural networks by up to 11% compared to existing methods.
arXiv Detail & Related papers (2024-10-07T22:22:02Z)
- Efficient NeRF Optimization -- Not All Samples Remain Equally Hard [9.404889815088161]
We propose an application of online hard sample mining for efficient training of Neural Radiance Fields (NeRF).
NeRF models produce state-of-the-art quality for many 3D reconstruction and rendering tasks but require substantial computational resources.
arXiv Detail & Related papers (2024-08-06T13:49:01Z)
- Posterior Sampling with Delayed Feedback for Reinforcement Learning with Linear Function Approximation [62.969796245827006]
Delayed-PSVI is an optimistic value-based algorithm that explores the value function space via noise perturbation with posterior sampling.
We show our algorithm achieves $\widetilde{O}(\sqrt{d^3H^3 T} + d^2H^2 \mathbb{E}[\tau])$ worst-case regret in the presence of unknown delays.
We incorporate a gradient-based approximate sampling scheme via Langevin dynamics for Delayed-LPSVI.
arXiv Detail & Related papers (2023-10-29T06:12:43Z)
- RLSAC: Reinforcement Learning enhanced Sample Consensus for End-to-End Robust Estimation [74.47709320443998]
We propose RLSAC, a novel Reinforcement Learning enhanced SAmple Consensus framework for end-to-end robust estimation.
RLSAC employs a graph neural network to utilize both data and memory features to guide exploring directions for sampling the next minimum set.
Our experimental results demonstrate that RLSAC can learn from features to gradually explore a better hypothesis.
arXiv Detail & Related papers (2023-08-10T03:14:19Z)
- ReBotNet: Fast Real-time Video Enhancement [59.08038313427057]
Most restoration networks are slow, have a high computational cost, and can't be used for real-time video enhancement.
In this work, we design an efficient and fast framework to perform real-time enhancement for practical use-cases like live video calls and video streams.
To evaluate our method, we emulate two new datasets that simulate real-world video call and streaming scenarios, and show extensive results on multiple datasets where ReBotNet outperforms existing approaches with lower computations, reduced memory requirements, and faster inference time.
arXiv Detail & Related papers (2023-03-23T17:58:05Z)
- Boosting Tail Neural Network for Realtime Custom Keyword Spotting [2.5137859989323537]
We propose a Boosting Tail Neural Network (BTNN) for improving the performance of Realtime Custom Keyword Spotting (RCKS).
Inspired by the observation from brain science that the brain is only partly activated by a given nerve stimulus, numerous machine learning algorithms have been developed that use a batch of weak classifiers to solve arduous problems.
arXiv Detail & Related papers (2022-05-24T13:26:39Z)
- Mastering Atari Games with Limited Data [73.6189496825209]
We propose a sample efficient model-based visual RL algorithm built on MuZero, which we name EfficientZero.
Our method achieves 190.4% mean human performance on the Atari 100k benchmark with only two hours of real-time game experience.
This is the first time an algorithm achieves super-human performance on Atari games with such little data.
arXiv Detail & Related papers (2021-10-30T09:13:39Z)
- Parallel Actors and Learners: A Framework for Generating Scalable RL Implementations [14.432131909590824]
Reinforcement Learning (RL) has achieved significant success in application domains such as robotics, games, health care and others.
Current implementations exhibit poor performance due to challenges such as irregular memory accesses and synchronization overheads.
We propose a framework for generating scalable reinforcement learning implementations on multicore systems.
arXiv Detail & Related papers (2021-10-03T21:00:53Z)
- Circa: Stochastic ReLUs for Private Deep Learning [6.538025863698682]
We re-think the ReLU computation and propose optimizations for PI tailored to neural networks.
Specifically, we reformulate ReLU as an approximate sign test and introduce a novel truncation method for the sign test.
We demonstrate improvements of up to 4.7x storage and 3x runtime over baseline implementations.
arXiv Detail & Related papers (2021-06-15T22:52:45Z)
- FNAS: Uncertainty-Aware Fast Neural Architecture Search [54.49650267859032]
Reinforcement learning (RL)-based neural architecture search (NAS) generally guarantees better convergence yet suffers from the requirement of huge computational resources.
We propose a general pipeline to accelerate the convergence of the rollout process as well as the RL process in NAS.
Experiments on the Mobile Neural Architecture Search (MNAS) search space show the proposed Fast Neural Architecture Search (FNAS) accelerates standard RL-based NAS process by 10x.
arXiv Detail & Related papers (2021-05-25T06:32:52Z)
- Top-k Training of GANs: Improving GAN Performance by Throwing Away Bad Samples [67.11669996924671]
We introduce a simple (one line of code) modification to the Generative Adversarial Network (GAN) training algorithm.
When updating the generator parameters, we zero out the gradient contributions from the elements of the batch that the critic scores as 'least realistic'.
We show that this 'top-k update' procedure is a generally applicable improvement.
arXiv Detail & Related papers (2020-02-14T19:27:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.