Practical Massively Parallel Monte-Carlo Tree Search Applied to
Molecular Design
- URL: http://arxiv.org/abs/2006.10504v3
- Date: Tue, 6 Apr 2021 06:33:09 GMT
- Title: Practical Massively Parallel Monte-Carlo Tree Search Applied to
Molecular Design
- Authors: Xiufeng Yang and Tanuj Kr Aasawat and Kazuki Yoshizoe
- Abstract summary: We propose a novel massively parallel Monte-Carlo Tree Search (MP-MCTS) algorithm that works efficiently at the 1,000-worker scale, and apply it to molecular design.
MP-MCTS maintains the search quality at larger scales, and by running MP-MCTS on 256 CPU cores for only 10 minutes, we obtained candidate molecules with scores similar to those found by non-parallel MCTS running for 42 hours.
- Score: 7.992550355579791
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: It is common practice to use large computational resources to train neural
networks, as is known from many examples, such as reinforcement learning
applications. However, while massively parallel computing is often used for
training models, it is rarely used for searching for solutions to combinatorial
optimization problems. In this paper, we propose a novel massively parallel
Monte-Carlo Tree Search (MP-MCTS) algorithm that works efficiently at the
1,000-worker scale, and apply it to molecular design. This is the first work that
applies distributed MCTS to a real-world, non-game problem. Existing work on
large-scale parallel MCTS shows efficient scalability in terms of the number of
rollouts up to 100 workers, but suffers from degradation in solution quality.
MP-MCTS maintains the search quality at larger scales, and by running MP-MCTS
on 256 CPU cores for only 10 minutes, we obtained candidate molecules with
scores similar to those found by non-parallel MCTS running for 42 hours.
Moreover, our results based on parallel MCTS (combined with a simple RNN model)
significantly outperform existing state-of-the-art work. Our method is generic
and is expected to speed up other applications of MCTS.
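To make the search procedure concrete, below is a minimal single-worker UCT sketch that grows a tree over SMILES tokens, in the spirit of MCTS-based molecular generators. It is an illustration only, not the authors' MP-MCTS: the distributed coordination across workers, the RNN that supplies token priors, and the chemistry-based reward (which a real run would compute with a toolkit such as RDKit) are replaced by hypothetical placeholders (the `TOKENS` vocabulary and the `prior` and `reward` functions).

```python
import math
import random

# Token vocabulary and search settings are illustrative placeholders only.
TOKENS = ["C", "c", "N", "O", "(", ")", "=", "1", "$"]  # "$" marks end of string
C_UCT = 1.0       # exploration constant in the UCT formula
MAX_LEN = 40      # cap on generated SMILES length

def reward(smiles: str) -> float:
    """Placeholder score; a real run would evaluate the molecule (e.g. with RDKit)."""
    return random.random()

def prior(prefix: str) -> dict:
    """Placeholder next-token distribution; the paper pairs MCTS with an RNN here."""
    return {t: 1.0 / len(TOKENS) for t in TOKENS}

class Node:
    def __init__(self, prefix: str = ""):
        self.prefix = prefix    # partial SMILES string this node represents
        self.children = {}      # token -> Node
        self.visits = 0
        self.value_sum = 0.0    # sum of rewards backed up through this node

    def select_child(self) -> "Node":
        # UCT: mean reward plus an exploration bonus that shrinks with visits.
        def uct(child: "Node") -> float:
            if child.visits == 0:
                return float("inf")
            return (child.value_sum / child.visits
                    + C_UCT * math.sqrt(math.log(self.visits) / child.visits))
        return max(self.children.values(), key=uct)

def rollout(prefix: str) -> float:
    """Complete the string by sampling from the placeholder prior, then score it."""
    s = prefix
    while len(s) < MAX_LEN and not s.endswith("$"):
        p = prior(s)
        s += random.choices(list(p), weights=list(p.values()))[0]
    return reward(s)

def mcts(n_iterations: int = 1000) -> Node:
    root = Node()
    for _ in range(n_iterations):
        # 1. Selection: walk down fully expanded nodes by UCT.
        node, path = root, [root]
        while node.children and len(node.children) == len(TOKENS):
            node = node.select_child()
            path.append(node)
        # 2. Expansion: add one untried token unless the string is terminal.
        if not node.prefix.endswith("$"):
            untried = [t for t in TOKENS if t not in node.children]
            token = random.choice(untried)
            node.children[token] = Node(node.prefix + token)
            path.append(node.children[token])
        # 3. Simulation and 4. backpropagation.
        score = rollout(path[-1].prefix)
        for n in path:
            n.visits += 1
            n.value_sum += score
    return root

if __name__ == "__main__":
    tree = mcts(200)
    print("root visited", tree.visits, "times")
```

A parallel variant would additionally have to coordinate visit counts and backed-up values across many workers without degrading search quality, which is the part the paper's MP-MCTS addresses.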
Related papers
- DiPaCo: Distributed Path Composition [31.686642863608558] (2024-03-15)
We propose a co-designed modular architecture and training approach for machine learning models.
During training, DiPaCo distributes computation by paths through a set of shared modules.
At inference time, only a single path needs to be executed for each input, without the need for model compression.
- Distributed Inference and Fine-tuning of Large Language Models Over The Internet [91.00270820533272] (2023-12-13)
Large language models (LLMs) are useful in many NLP tasks and become more capable with size.
These models require high-end hardware, making them inaccessible to most researchers.
We develop fault-tolerant inference algorithms and load-balancing protocols that automatically assign devices to maximize the total system throughput.
- MARLIN: Soft Actor-Critic based Reinforcement Learning for Congestion Control in Real Networks [63.24965775030673] (2023-02-02)
We propose a novel Reinforcement Learning (RL) approach to design generic Congestion Control (CC) algorithms.
Our solution, MARLIN, uses the Soft Actor-Critic algorithm to maximize both entropy and return.
We trained MARLIN on a real network with varying background traffic patterns to overcome the sim-to-real mismatch.
- Partitioning Distributed Compute Jobs with Reinforcement Learning and Graph Neural Networks [58.720142291102135] (2023-01-31)
Large-scale machine learning models are bringing advances to a broad range of fields.
Many of these models are too large to be trained on a single machine, and must be distributed across multiple devices.
We show that maximum parallelisation is sub-optimal in relation to user-critical metrics such as throughput and blocking rate.
- SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient [69.61083127540776] (2023-01-27)
Deep learning applications benefit from using large models with billions of parameters.
Training these models is notoriously expensive due to the need for specialized HPC clusters.
We consider alternative setups for training large models: using cheap "preemptible" instances or pooling existing resources from multiple regions.
- Single MCMC Chain Parallelisation on Decision Trees [0.9137554315375919] (2022-07-26)
We propose a method to parallelise a single MCMC decision tree chain on an average laptop or personal computer.
Experiments showed an 18-times speedup in running time, with the serial and parallel implementations remaining statistically identical.
- A Transferable Approach for Partitioning Machine Learning Models on Multi-Chip-Modules [8.224904698490626] (2021-12-07)
Multi-Chip-Modules (MCMs) reduce the design and fabrication cost of machine learning accelerators.
We present a strategy using a deep reinforcement learning framework to emit a possibly invalid candidate partition that is then corrected by a constraint solver.
Our evaluation of a production-scale model, BERT, on real hardware reveals that the partitioning generated using the RL policy achieves 6.11% and 5.85% higher throughput.
- On Effective Parallelization of Monte Carlo Tree Search [51.15940034629022] (2020-06-15)
Monte Carlo Tree Search (MCTS) is computationally expensive as it requires a substantial number of rollouts to construct the search tree.
How to design effective parallel MCTS algorithms has not been systematically studied and remains poorly understood.
We demonstrate how the proposed necessary conditions can be adopted to design more effective parallel MCTS algorithms.
- MPLP++: Fast, Parallel Dual Block-Coordinate Ascent for Dense Graphical Models [96.1052289276254] (2020-04-16)
This work introduces a new MAP-solver, based on the popular Dual Block-Coordinate Ascent principle.
Surprisingly, by making a small change to the low-performing MPLP solver, we derive the new solver MPLP++, which outperforms all existing solvers by a large margin.
This list is automatically generated from the titles and abstracts of the papers on this site.