DirectMultiStep: Direct Route Generation for Multi-Step Retrosynthesis
- URL: http://arxiv.org/abs/2405.13983v1
- Date: Wed, 22 May 2024 20:39:05 GMT
- Title: DirectMultiStep: Direct Route Generation for Multi-Step Retrosynthesis
- Authors: Yu Shee, Haote Li, Anton Morgunov, Victor Batista
- Abstract summary: We introduce a transformer-based model that generates multi-step synthetic routes as a single string by conditionally predicting each molecule based on all preceding ones.
The model accommodates specific conditions such as the desired number of steps and starting materials, outperforming state-of-the-art methods on the PaRoutes dataset.
It also successfully predicts routes for FDA-approved drugs not included in the training data, showcasing its generalization capabilities.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Traditional computer-aided synthesis planning (CASP) methods rely on iterative single-step predictions, leading to exponential search space growth that limits efficiency and scalability. We introduce a transformer-based model that directly generates multi-step synthetic routes as a single string by conditionally predicting each molecule based on all preceding ones. The model accommodates specific conditions such as the desired number of steps and starting materials, outperforming state-of-the-art methods on the PaRoutes dataset with a 2.2x improvement in Top-1 accuracy on the n$_1$ test set and a 3.3x improvement on the n$_5$ test set. It also successfully predicts routes for FDA-approved drugs not included in the training data, showcasing its generalization capabilities. While the current suboptimal diversity of the training set may impact performance on less common reaction types, our approach presents a promising direction towards fully automated retrosynthetic planning.
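The idea of writing a whole multi-step route as one string can be illustrated with a small sketch. Everything here is an assumption made for illustration, not the authors' actual encoding: the nested `smiles`/`children` layout, the JSON serialization, the helper names `serialize`, `count_steps`, and `starting_materials`, and the textbook paracetamol route used as sample data. The step count is taken to be the longest linear sequence of reactions, which is only one possible definition.

```python
import json

# Hypothetical route tree: each node carries a SMILES string and, unless it is
# a starting material, a list of precursor nodes under "children".
# Sample data only: paracetamol from 4-aminophenol and acetic anhydride,
# with 4-aminophenol made from 4-nitrophenol.
route = {
    "smiles": "CC(=O)Nc1ccc(O)cc1",
    "children": [
        {
            "smiles": "Nc1ccc(O)cc1",
            "children": [{"smiles": "O=[N+]([O-])c1ccc(O)cc1"}],
        },
        {"smiles": "CC(=O)OC(C)=O"},
    ],
}


def serialize(node):
    """Flatten a route tree into one compact string (assumed JSON encoding)."""
    return json.dumps(node, separators=(",", ":"))


def count_steps(node):
    """Return the longest linear sequence of reactions below this node."""
    children = node.get("children", [])
    if not children:
        return 0
    return 1 + max(count_steps(child) for child in children)


def starting_materials(node):
    """Collect the leaf molecules, i.e. compounds with no precursors."""
    children = node.get("children", [])
    if not children:
        return [node["smiles"]]
    leaves = []
    for child in children:
        leaves.extend(starting_materials(child))
    return leaves


route_string = serialize(route)
restored = json.loads(route_string)  # the string round-trips to the same tree
```

In this sketch, `count_steps(route)` and `starting_materials(route)` yield the two kinds of conditions the abstract mentions (number of steps and starting materials), while `route_string` is the sort of single sequence a model could emit token by token, with each molecule conditioned on the ones before it.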
Related papers
- Aligning Few-Step Diffusion Models with Dense Reward Difference Learning [81.85515625591884]
Stepwise Diffusion Policy Optimization (SDPO) is an alignment method tailored for few-step diffusion models.
SDPO incorporates dense reward feedback at every intermediate step to ensure consistent alignment across all denoising steps.
SDPO consistently outperforms prior methods in reward-based alignment across diverse step configurations.
arXiv Detail & Related papers (2024-11-18T16:57:41Z)
- Preference Optimization for Molecule Synthesis with Conditional Residual Energy-based Models [35.314442982529904]
Current data-driven strategies employ one-step retro models and search algorithms to predict synthetic routes in a top-bottom manner.
Existing strategies cannot control the generation of synthetic routes based on possible criteria such as material costs, yields, and step count.
We propose a general and principled framework via conditional residual energy-based models (EBMs) that focus on the quality of the entire synthetic route.
arXiv Detail & Related papers (2024-06-04T07:49:30Z)
- Align Your Steps: Optimizing Sampling Schedules in Diffusion Models [63.927438959502226]
Diffusion models (DMs) have established themselves as the state-of-the-art generative modeling approach in the visual domain and beyond.
A crucial drawback of DMs is their slow sampling speed, relying on many sequential function evaluations through large neural networks.
We propose a general and principled approach to optimizing the sampling schedules of DMs for high-quality outputs.
arXiv Detail & Related papers (2024-04-22T18:18:41Z)
- When Parameter-efficient Tuning Meets General-purpose Vision-language Models [65.19127815275307]
PETAL revolutionizes the training process by requiring only 0.5% of the total parameters, achieved through a unique mode approximation technique.
Our experiments reveal that PETAL not only outperforms current state-of-the-art methods in most scenarios but also surpasses full fine-tuning models in effectiveness.
arXiv Detail & Related papers (2023-12-16T17:13:08Z)
- Context-aware Pedestrian Trajectory Prediction with Multimodal Transformer [16.457778420360537]
We propose a novel solution for predicting future trajectories of pedestrians.
Our method uses a multimodal encoder-decoder transformer architecture, which takes as input both pedestrian locations and ego-vehicle speeds.
We perform detailed experiments and evaluate our method on two popular datasets, PIE and JAAD.
arXiv Detail & Related papers (2023-07-07T18:21:05Z)
- Retrosynthetic Planning with Dual Value Networks [107.97218669277913]
We propose a novel online training algorithm called Planning with Dual Value Networks (PDVN).
PDVN alternates between the planning phase and updating phase to predict the synthesizability and cost of molecules.
On the widely-used USPTO dataset, our PDVN algorithm improves the search success rate of existing multi-step planners.
arXiv Detail & Related papers (2023-01-31T16:43:53Z)
- Mind the Retrosynthesis Gap: Bridging the divide between Single-step and Multi-step Retrosynthesis Prediction [0.9134244356393664]
Multi-step approaches repeatedly apply the chemical information stored in single-step retrosynthesis models.
We show that models designed for single-step retrosynthesis, when extended to multi-step, can have a tremendous impact on the route finding capabilities of current multi-step methods.
arXiv Detail & Related papers (2022-12-12T18:06:24Z)
- SMART: Simultaneous Multi-Agent Recurrent Trajectory Prediction [72.37440317774556]
We propose advances that address two key challenges in future trajectory prediction: multimodality in both training data and predictions, and constant-time inference regardless of the number of agents.
arXiv Detail & Related papers (2020-07-26T08:17:10Z)
- Fast, Accurate, and Simple Models for Tabular Data via Augmented Distillation [97.42894942391575]
We propose FAST-DAD to distill arbitrarily complex ensemble predictors into individual models like boosted trees, random forests, and deep networks.
Our individual distilled models are over 10x faster and more accurate than ensemble predictors produced by AutoML tools like H2O/AutoSklearn.
arXiv Detail & Related papers (2020-06-25T09:57:47Z)