Retrosynthetic Planning with Dual Value Networks
- URL: http://arxiv.org/abs/2301.13755v3
- Date: Sun, 3 Mar 2024 14:23:21 GMT
- Title: Retrosynthetic Planning with Dual Value Networks
- Authors: Guoqing Liu, Di Xue, Shufang Xie, Yingce Xia, Austin Tripp, Krzysztof
Maziarz, Marwin Segler, Tao Qin, Zongzhang Zhang, Tie-Yan Liu
- Abstract summary: We propose a novel online training algorithm, called Planning with Dual Value Networks (PDVN)
PDVN alternates between a planning phase and an updating phase, and uses two separate value networks to predict the synthesizability and cost of molecules.
On the widely-used USPTO dataset, our PDVN algorithm improves the search success rate of existing multi-step planners.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Retrosynthesis, which aims to find a route to synthesize a target molecule
from commercially available starting materials, is a critical task in drug
discovery and materials design. Recently, the combination of ML-based
single-step reaction predictors with multi-step planners has led to promising
results. However, the single-step predictors are mostly trained offline to
optimize the single-step accuracy, without considering complete routes. Here,
we leverage reinforcement learning (RL) to improve the single-step predictor,
by using a tree-shaped MDP to optimize complete routes. Specifically, we
propose a novel online training algorithm, called Planning with Dual Value
Networks (PDVN), which alternates between the planning phase and updating
phase. In PDVN, we construct two separate value networks to predict the
synthesizability and cost of molecules, respectively. To maintain the
single-step accuracy, we design a two-branch network structure for the
single-step predictor. On the widely-used USPTO dataset, our PDVN algorithm
improves the search success rate of existing multi-step planners (e.g.,
increasing the success rate from 85.79% to 98.95% for Retro*, and reducing the
number of model calls by half while solving 99.47% molecules for RetroGraph).
Additionally, PDVN helps find shorter synthesis routes (e.g., reducing the
average route length from 5.76 to 4.83 for Retro*, and from 5.63 to 4.78 for
RetroGraph). Our code is available at https://github.com/DiXue98/PDVN.
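For intuition, here is a minimal PyTorch-style sketch of the dual value networks and the two-branch single-step predictor described above. The fingerprint encoder, hidden sizes, and template count are illustrative assumptions, not the authors' released implementation (see the repository above for that).

    # Minimal sketch of PDVN-style components (illustrative, not official).
    import torch
    import torch.nn as nn

    class MoleculeEncoder(nn.Module):
        """Maps a molecular fingerprint to a shared embedding."""
        def __init__(self, fp_dim=2048, hidden=512):
            super().__init__()
            self.net = nn.Sequential(nn.Linear(fp_dim, hidden), nn.ReLU())

        def forward(self, fp):
            return self.net(fp)

    class DualValueNetworks(nn.Module):
        """Two separate value heads: the probability that a molecule is
        synthesizable, and the expected cost of synthesizing it."""
        def __init__(self, hidden=512):
            super().__init__()
            self.encoder = MoleculeEncoder(hidden=hidden)
            self.syn_head = nn.Sequential(nn.Linear(hidden, 1), nn.Sigmoid())
            self.cost_head = nn.Linear(hidden, 1)

        def forward(self, fp):
            h = self.encoder(fp)
            return self.syn_head(h).squeeze(-1), self.cost_head(h).squeeze(-1)

    class TwoBranchSingleStep(nn.Module):
        """Single-step predictor with an offline branch (preserves the
        supervised single-step accuracy) and an RL branch (updated from
        planning experience); both score reaction templates."""
        def __init__(self, hidden=512, n_templates=10000):
            super().__init__()
            self.encoder = MoleculeEncoder(hidden=hidden)
            self.offline_branch = nn.Linear(hidden, n_templates)
            self.rl_branch = nn.Linear(hidden, n_templates)

        def forward(self, fp, use_rl=True):
            h = self.encoder(fp)
            branch = self.rl_branch if use_rl else self.offline_branch
            return torch.log_softmax(branch(h), dim=-1)

    dvn = DualValueNetworks()
    syn_prob, cost = dvn(torch.rand(4, 2048))  # batch of 4 fingerprints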
Related papers
- rule4ml: An Open-Source Tool for Resource Utilization and Latency Estimation for ML Models on FPGA
This paper introduces a novel method to predict the resource utilization and inference latency of Neural Networks (NNs) before their synthesis and implementation on FPGA.
We leverage HLS4ML, a tool-flow that helps translate NNs into high-level synthesis (HLS) code.
Our method uses trained regression models for immediate pre-synthesis predictions.
arXiv Detail & Related papers (2024-08-09T19:35:10Z)
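A minimal sketch of the general idea, assuming simple architecture descriptors and a scikit-learn regressor; the feature names, numbers, and model choice are illustrative, not rule4ml's actual interface.

    # Hypothetical sketch: regress post-synthesis FPGA resource usage
    # from NN architecture descriptors, before running HLS at all.
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    # Each row: [n_layers, total_params, max_layer_width, bit_width]
    X = np.array([[3, 2_000, 64, 8], [5, 10_000, 128, 16], [4, 6_000, 64, 8]])
    y = np.array([1_200, 9_800, 4_100])  # e.g. LUT usage observed after synthesis

    model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
    print(model.predict([[4, 5_000, 64, 8]]))  # pre-synthesis LUT estimate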
- DirectMultiStep: Direct Route Generation for Multi-Step Retrosynthesis
We introduce a transformer-based model that generates multi-step synthetic routes as a single string by conditionally predicting each molecule based on all preceding ones.
The model accommodates specific conditions such as the desired number of steps and starting materials, outperforming state-of-the-art methods on the PaRoutes dataset.
It also successfully predicts routes for FDA-approved drugs not included in the training data, showcasing its generalization capabilities.
arXiv Detail & Related papers (2024-05-22T20:39:05Z)
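As a rough illustration, a seq2seq transformer can decode a route string token by token, conditioned on the target molecule, the desired number of steps, and optional starting materials. The model and tokens below are placeholders, not the authors' released artifacts.

    # Hypothetical sketch of conditional route-string generation.
    import torch
    import torch.nn as nn

    class RouteGenerator(nn.Module):
        def __init__(self, vocab=1000, d=256):
            super().__init__()
            self.emb = nn.Embedding(vocab, d)
            self.tf = nn.Transformer(d_model=d, batch_first=True)
            self.out = nn.Linear(d, vocab)

        def forward(self, src, tgt):
            # Causal mask so each route token sees only preceding ones.
            mask = self.tf.generate_square_subsequent_mask(tgt.size(1))
            h = self.tf(self.emb(src), self.emb(tgt), tgt_mask=mask)
            return self.out(h)

    # Greedy decode; src encodes "<target SMILES> <n_steps> <starting materials>".
    def greedy_route(model, src, bos=1, eos=2, max_len=128):
        tgt = torch.full((1, 1), bos, dtype=torch.long)
        for _ in range(max_len):
            nxt = model(src, tgt)[:, -1].argmax(-1, keepdim=True)
            tgt = torch.cat([tgt, nxt], dim=1)
            if nxt.item() == eos:
                break
        return tgt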
- Retrosynthesis Prediction with Local Template Retrieval
Retrosynthesis, which predicts the reactants of a given target molecule, is an essential task for drug discovery.
In this work, we introduce RetroKNN, a local reaction template retrieval method.
We conduct comprehensive experiments on two widely used benchmarks, the USPTO-50K and USPTO-MIT.
arXiv Detail & Related papers (2023-06-07T03:38:03Z)
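A hedged sketch of the retrieval idea: blend the neural predictor's template distribution with a distribution induced by nearest neighbors in a datastore of stored embeddings. This is a generic kNN augmentation, not RetroKNN's exact adapter.

    # Illustrative kNN-based template retrieval and interpolation.
    import numpy as np

    def knn_template_probs(query, keys, template_ids, n_templates, k=8, temp=1.0):
        """query: (d,) embedding; keys: (N, d) stored embeddings;
        template_ids: (N,) template index for each stored key."""
        d2 = ((keys - query) ** 2).sum(axis=1)   # squared L2 distances
        nn_idx = np.argsort(d2)[:k]              # k nearest neighbors
        w = np.exp(-d2[nn_idx] / temp)
        w /= w.sum()
        probs = np.zeros(n_templates)
        for i, wi in zip(nn_idx, w):
            probs[template_ids[i]] += wi         # aggregate weight per template
        return probs

    def blend(model_probs, knn_probs, lam=0.5):
        """Interpolate the model and retrieval distributions."""
        return lam * knn_probs + (1 - lam) * model_probs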
- Back to MLP: A Simple Baseline for Human Motion Prediction
This paper tackles the problem of human motion prediction, i.e., forecasting future body poses from historically observed sequences.
We show that state-of-the-art deep learning approaches can be surpassed by a light-weight, purely MLP-based architecture with only 0.14M parameters.
An exhaustive evaluation on Human3.6M, AMASS and 3DPW datasets shows that our method, which we dub siMLPe, consistently outperforms all other approaches.
arXiv Detail & Related papers (2022-07-04T16:35:58Z)
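A hypothetical sketch of an MLP-only motion baseline: fully connected layers mixing the temporal dimension of a pose sequence. Layer sizes and the activation are illustrative, not siMLPe's exact design.

    # Illustrative MLP-only motion prediction baseline.
    import torch
    import torch.nn as nn

    class MLPMotionBaseline(nn.Module):
        def __init__(self, n_in_frames=50, n_out_frames=25, n_joints=66):
            super().__init__()
            self.temporal_mix = nn.Sequential(
                nn.LayerNorm(n_in_frames),
                nn.Linear(n_in_frames, n_in_frames), nn.ReLU(),
                nn.Linear(n_in_frames, n_out_frames),
            )

        def forward(self, poses):          # poses: (B, T_in, n_joints)
            x = poses.transpose(1, 2)      # (B, n_joints, T_in): mix over time
            y = self.temporal_mix(x)       # (B, n_joints, T_out)
            return y.transpose(1, 2)       # (B, T_out, n_joints)

    future = MLPMotionBaseline()(torch.rand(2, 50, 66))  # (2, 25, 66)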
- A Hybrid Framework for Sequential Data Prediction with End-to-End Optimization
We investigate nonlinear prediction in an online setting and introduce a hybrid model that mitigates the need for hand-designed features and manual model selection.
We employ a recurrent neural network (LSTM) for adaptive feature extraction from sequential data and a gradient boosting machinery (soft GBDT) for effective supervised regression.
We demonstrate the learning behavior of our algorithm on synthetic data and show significant performance improvements over conventional methods on various real-life datasets.
arXiv Detail & Related papers (2022-03-25T17:13:08Z)
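A sketch of the hybrid idea under simplifying assumptions: the paper trains both parts jointly end-to-end with a soft (differentiable) GBDT, whereas this illustration extracts LSTM features and fits scikit-learn's hard GBDT separately.

    # Illustrative LSTM feature extractor + gradient-boosted regressor.
    import numpy as np
    import torch
    import torch.nn as nn
    from sklearn.ensemble import GradientBoostingRegressor

    class LSTMFeatures(nn.Module):
        def __init__(self, hidden=16):
            super().__init__()
            self.lstm = nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)

        @torch.no_grad()
        def forward(self, windows):        # windows: (B, T, 1)
            _, (h, _) = self.lstm(windows)
            return h[-1]                   # (B, hidden) last hidden state

    # Toy usage: predict x[t] from the previous T values of a noisy sine.
    T = 20
    x = np.sin(np.linspace(0, 50, 1000)) + 0.1 * np.random.randn(1000)
    wins = np.stack([x[i:i + T] for i in range(len(x) - T)])
    feats = LSTMFeatures()(torch.tensor(wins[:, :, None], dtype=torch.float32)).numpy()
    GradientBoostingRegressor().fit(feats, x[T:])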
- Scalable Optimal Transport in High Dimensions for Graph Distances, Embedding Alignment, and More
We propose two effective log-linear time approximations of the cost matrix for optimal transport.
These approximations enable general log-linear time algorithms for entropy-regularized OT that perform well even in complex, high-dimensional spaces.
For graph distance regression we propose the graph transport network (GTN), which combines graph neural networks (GNNs) with enhanced Sinkhorn.
arXiv Detail & Related papers (2021-07-14T17:40:08Z)
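For reference, plain entropy-regularized OT via Sinkhorn iterations (without the paper's log-linear cost-matrix approximations) looks like this:

    # Standard Sinkhorn iterations for entropy-regularized OT.
    import numpy as np

    def sinkhorn(a, b, C, eps=0.1, n_iters=200):
        """a: (n,) source marginal; b: (m,) target marginal;
        C: (n, m) cost matrix. Returns the (n, m) transport plan."""
        K = np.exp(-C / eps)                 # Gibbs kernel
        u = np.ones_like(a)
        for _ in range(n_iters):
            v = b / (K.T @ u)                # match column marginals
            u = a / (K @ v)                  # match row marginals
        return u[:, None] * K * v[None, :]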
- Exploiting Adam-like Optimization Algorithms to Improve the Performance of Convolutional Neural Networks
Stochastic gradient descent (SGD) is the main approach for training deep networks.
In this work, we compare Adam-based variants that exploit the difference between the present and the past gradients.
We test ensembles of networks and their fusion with a ResNet50 trained with SGD.
arXiv Detail & Related papers (2021-03-26T18:55:08Z)
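One example of such a variant is diffGrad, which damps the Adam step with a friction coefficient computed from the difference between present and past gradients. A minimal NumPy sketch (not the paper's experimental code):

    # diffGrad-style update: Adam damped by a gradient-difference term.
    import numpy as np

    def init_state(p):
        """Per-parameter optimizer state."""
        z = np.zeros_like(p, dtype=float)
        return {"t": 0, "m": z.copy(), "v": z.copy(), "prev_g": z.copy()}

    def diffgrad_step(p, g, state, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
        state["t"] += 1
        state["m"] = b1 * state["m"] + (1 - b1) * g       # first moment
        state["v"] = b2 * state["v"] + (1 - b2) * g * g   # second moment
        # Friction coefficient: sigmoid of |past - present gradient|.
        xi = 1.0 / (1.0 + np.exp(-np.abs(state["prev_g"] - g)))
        state["prev_g"] = np.array(g, dtype=float, copy=True)
        m_hat = state["m"] / (1 - b1 ** state["t"])
        v_hat = state["v"] / (1 - b2 ** state["t"])
        return p - lr * xi * m_hat / (np.sqrt(v_hat) + eps)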
- Non-Parametric Adaptive Network Pruning
We introduce non-parametric modeling to simplify the algorithm design.
Inspired by the face recognition community, we use a message passing algorithm to obtain an adaptive number of exemplars.
EPruner breaks the dependency on the training data in determining the "important" filters.
arXiv Detail & Related papers (2021-01-20T06:18:38Z)
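A minimal sketch of exemplar selection with Affinity Propagation, the message-passing algorithm referenced above; the filter flattening and usage are illustrative assumptions.

    # Choose an adaptive number of "exemplar" filters via message passing.
    import numpy as np
    from sklearn.cluster import AffinityPropagation

    def select_exemplar_filters(weights):
        """weights: (n_filters, k*k*c_in) flattened conv filters.
        Affinity Propagation picks the number of exemplars itself,
        so the per-layer pruning rate needs no manual tuning."""
        ap = AffinityPropagation(random_state=0).fit(weights)
        return np.unique(ap.cluster_centers_indices_)

    keep = select_exemplar_filters(np.random.randn(64, 3 * 3 * 16))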
- Communication-Efficient Distributed Stochastic AUC Maximization with Deep Neural Networks
We study distributed stochastic algorithms for large-scale AUC maximization with a deep neural network as the predictive model.
Our algorithm requires far fewer communication rounds and still achieves a linear speedup in theory.
Our experiments on several benchmark datasets show the effectiveness of our method and also confirm our theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z)
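For context, here is a sketch of the square-loss AUC objective in the min-max form this line of work builds on; minibatch estimates of this objective admit distributed stochastic primal-dual updates. Variable roles follow the standard reformulation and are labeled in the docstring.

    # Square-loss AUC maximization in min-max form (Ying et al.-style).
    import torch

    def auc_minmax_loss(scores, labels, a, b, alpha, p):
        """scores: model outputs h(x); labels in {0, 1}; p = P(y = 1);
        a, b (minimized) and alpha (maximized) are the scalar auxiliary
        variables of the min-max reformulation."""
        pos = (labels == 1).float()
        neg = (labels == 0).float()
        loss = ((1 - p) * (scores - a) ** 2 * pos
                + p * (scores - b) ** 2 * neg
                + 2 * (1 + alpha) * (p * scores * neg - (1 - p) * scores * pos)
                - p * (1 - p) * alpha ** 2)
        return loss.mean()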