DirectMultiStep: Direct Route Generation for Multi-Step Retrosynthesis
- URL: http://arxiv.org/abs/2405.13983v1
- Date: Wed, 22 May 2024 20:39:05 GMT
- Title: DirectMultiStep: Direct Route Generation for Multi-Step Retrosynthesis
- Authors: Yu Shee, Haote Li, Anton Morgunov, Victor Batista
- Abstract summary: We introduce a transformer-based model that generates multi-step synthetic routes as a single string by conditionally predicting each molecule based on all preceding ones.
The model accommodates specific conditions such as the desired number of steps and starting materials, outperforming state-of-the-art methods on the PaRoutes dataset.
It also successfully predicts routes for FDA-approved drugs not included in the training data, showcasing its generalization capabilities.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Traditional computer-aided synthesis planning (CASP) methods rely on iterative single-step predictions, leading to exponential search space growth that limits efficiency and scalability. We introduce a transformer-based model that directly generates multi-step synthetic routes as a single string by conditionally predicting each molecule based on all preceding ones. The model accommodates specific conditions such as the desired number of steps and starting materials, outperforming state-of-the-art methods on the PaRoutes dataset with a 2.2x improvement in Top-1 accuracy on the $n_1$ test set and a 3.3x improvement on the $n_5$ test set. It also successfully predicts routes for FDA-approved drugs not included in the training data, showcasing its generalization capabilities. While the current suboptimal diversity of the training set may impact performance on less common reaction types, our approach presents a promising direction towards fully automated retrosynthetic planning.
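To make the route-as-string idea concrete, here is a minimal sketch of how a multi-step route could be serialized into one conditioned sequence for a seq2seq transformer to generate token by token. The tag names (`<steps=...>`, `<target>`, `<step>`) and layout are hypothetical illustrations, not the paper's actual tokenization.

```python
# Illustrative serialization of a retrosynthetic route as a single string.
# Conditioning (step count, target) is prepended so a decoder can predict
# each molecule given everything before it. Format is assumed, not the
# DirectMultiStep paper's exact scheme.

def encode_route(target, steps):
    """steps: list of (product_smiles, [reactant_smiles, ...]) tuples."""
    parts = [f"<steps={len(steps)}>", f"<target>{target}"]
    for product, reactants in steps:
        parts.append(f"<step>{product}>>{'.'.join(reactants)}")
    return "".join(parts)

# Toy 2-step route (real molecules, route chosen for illustration only):
# aspirin <- salicylic acid + acetic anhydride; salicylic acid <- methyl salicylate.
route = [
    ("CC(=O)Oc1ccccc1C(=O)O", ["OC(=O)c1ccccc1O", "CC(=O)OC(C)=O"]),
    ("OC(=O)c1ccccc1O", ["COC(=O)c1ccccc1O"]),
]
print(encode_route("CC(=O)Oc1ccccc1C(=O)O", route))
```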
Related papers
- AI methods for approximate compiling of unitaries [0.0]
This paper explores artificial intelligence (AI) methods for the approximate compiling of unitaries.
We focus on the use of fixed two-qubit gates and arbitrary single-qubit rotations typical of superconducting hardware.
Our approach involves three main stages: identifying an initial template that approximates the target unitary, predicting initial parameters for this template, and refining these parameters to maximize the fidelity of the circuit.
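The refinement stage amounts to maximizing circuit fidelity over the template's rotation angles. Below is a minimal sketch of that objective, $F = |\mathrm{Tr}(U^\dagger V)|/2^n$, with a single-qubit rotation standing in for the paper's actual ansatz.

```python
# Sketch of the fidelity objective refined in the final stage. The
# one-parameter template below is a stand-in, not the paper's circuit.
import numpy as np

def fidelity(U, V):
    """Normalized overlap |Tr(U^dagger V)| / d between two unitaries."""
    d = U.shape[0]
    return abs(np.trace(U.conj().T @ V)) / d

def rz(theta):
    """Single-qubit Z rotation, a typical hardware-native gate."""
    return np.diag([np.exp(-1j * theta / 2), np.exp(1j * theta / 2)])

U = rz(0.7)  # target unitary with a known angle
best = max((fidelity(U, rz(t)), t) for t in np.linspace(0, 2 * np.pi, 1000))
print(best)  # fidelity ~1.0 recovered near theta = 0.7
```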
arXiv Detail & Related papers (2024-07-30T22:30:15Z)
- Preference Optimization for Molecule Synthesis with Conditional Residual Energy-based Models [35.314442982529904]
Current data-driven strategies employ one-step retrosynthesis models and search algorithms to predict synthetic routes in a top-down manner.
However, these strategies cannot control route generation against criteria such as material cost, yield, and step count.
We propose a general and principled framework via conditional residual energy-based models (EBMs) that focus on the quality of the entire synthetic route.
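The route-level reranking this enables can be pictured with a small sketch: candidate routes from a base planner are re-scored by an energy over whole-route features, with lower energy preferred. The features and the linear energy here are hypothetical stand-ins for the learned conditional residual EBM.

```python
# Sketch of energy-based reranking of candidate synthetic routes.
# Feature choice and linear energy are illustrative assumptions.
import numpy as np

def energy(route_feats, w):
    """Lower energy = better route; a simple linear model for illustration."""
    return route_feats @ w

# Each row: [num_steps, est_material_cost, 1 - est_yield] for one candidate.
candidates = np.array([[3, 1.0, 0.4],
                       [5, 0.4, 0.1],
                       [2, 2.5, 0.5]])
w = np.array([0.5, 1.0, 2.0])  # trades off step count, cost, and yield
ranked = candidates[np.argsort(energy(candidates, w))]
print(ranked[0])               # best candidate under this energy
```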
arXiv Detail & Related papers (2024-06-04T07:49:30Z)
- Align Your Steps: Optimizing Sampling Schedules in Diffusion Models [63.927438959502226]
Diffusion models (DMs) have established themselves as the state-of-the-art generative modeling approach in the visual domain and beyond.
A crucial drawback of DMs is their slow sampling speed, relying on many sequential function evaluations through large neural networks.
We propose a general and principled approach to optimizing the sampling schedules of DMs for high-quality outputs.
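The key observation is that the sampler's timestep schedule is a free design choice. As a stand-in for the per-model optimized schedules the paper derives, the sketch below shows one common hand-designed spacing (the Karras rho-schedule); the paper's actual optimization is not reproduced here.

```python
# Sketch of a parameterized noise-level schedule for few-step sampling.
# Karras rho-spacing is used only to illustrate that schedules are tunable.
import numpy as np

def karras_sigmas(n, sigma_min=0.002, sigma_max=80.0, rho=7.0):
    """Noise levels for n sampling steps, spaced more densely near sigma_min."""
    ramp = np.linspace(0, 1, n)
    return (sigma_max ** (1 / rho)
            + ramp * (sigma_min ** (1 / rho) - sigma_max ** (1 / rho))) ** rho

print(karras_sigmas(10))  # 10 function evaluations instead of hundreds
```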
arXiv Detail & Related papers (2024-04-22T18:18:41Z)
- When Parameter-efficient Tuning Meets General-purpose Vision-language Models [65.19127815275307]
PETAL revolutionizes the training process by requiring only 0.5% of the total parameters, achieved through a unique mode approximation technique.
Our experiments reveal that PETAL not only outperforms current state-of-the-art methods in most scenarios but also surpasses full fine-tuning models in effectiveness.
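The general mechanism behind such parameter efficiency is freezing the pretrained backbone and training only a tiny add-on module. The generic bottleneck adapter below is an assumed stand-in; PETAL's mode-approximation technique itself is not shown.

```python
# Sketch of parameter-efficient tuning: freeze the backbone, train a small
# adapter, and count the trainable fraction. Sizes are illustrative.
import torch.nn as nn

backbone = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 512))
for p in backbone.parameters():
    p.requires_grad = False  # pretrained weights stay frozen

adapter = nn.Sequential(nn.Linear(512, 8), nn.ReLU(), nn.Linear(8, 512))

trainable = sum(p.numel() for p in adapter.parameters())
total = trainable + sum(p.numel() for p in backbone.parameters())
print(f"trainable fraction: {trainable / total:.3%}")  # on the order of 1%
```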
arXiv Detail & Related papers (2023-12-16T17:13:08Z)
- Context-aware Pedestrian Trajectory Prediction with Multimodal Transformer [16.457778420360537]
We propose a novel solution for predicting future trajectories of pedestrians.
Our method uses a multimodal encoder-decoder transformer architecture, which takes as input both pedestrian locations and ego-vehicle speeds.
We perform detailed experiments and evaluate our method on two popular datasets, PIE and JAAD.
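A minimal sketch of such a multimodal encoder-decoder follows: past pedestrian positions and ego-vehicle speeds are embedded into a shared space and fed through a transformer that decodes future positions. Dimensions, fusion by addition, and zero decoder queries are assumptions, not the paper's architecture.

```python
# Sketch of a two-modality encoder-decoder transformer for trajectory
# prediction. Shapes and fusion strategy are illustrative assumptions.
import torch
import torch.nn as nn

d = 64
pos_embed = nn.Linear(2, d)   # (x, y) per observed frame
spd_embed = nn.Linear(1, d)   # ego-vehicle speed per observed frame
model = nn.Transformer(d_model=d, nhead=4, num_encoder_layers=2,
                       num_decoder_layers=2, batch_first=True)
head = nn.Linear(d, 2)        # predicted (x, y) per future frame

past_pos = torch.randn(8, 15, 2)   # batch of 8, 15 observed frames
ego_speed = torch.randn(8, 15, 1)
queries = torch.zeros(8, 45, d)    # 45 future frames to decode

memory_in = pos_embed(past_pos) + spd_embed(ego_speed)  # fuse modalities
future = head(model(memory_in, queries))
print(future.shape)                # torch.Size([8, 45, 2])
```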
arXiv Detail & Related papers (2023-07-07T18:21:05Z)
- Retrosynthetic Planning with Dual Value Networks [107.97218669277913]
We propose a novel online training algorithm called Planning with Dual Value Networks (PDVN).
PDVN alternates between a planning phase and an updating phase to predict the synthesizability and cost of molecules.
On the widely-used USPTO dataset, our PDVN algorithm improves the search success rate of existing multi-step planners.
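The dual-value idea can be sketched as two heads over a shared molecule encoder: one estimates synthesizability, the other the cost of making the molecule, and a planner can combine both to rank search nodes. The fingerprint input, shared encoder, and ranking rule below are assumptions; PDVN's actual networks and training loop are not shown.

```python
# Sketch of dual value networks for guiding retrosynthetic search.
# Architecture and scoring rule are illustrative assumptions.
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(2048, 256), nn.ReLU())   # fingerprint in
syn_head = nn.Sequential(nn.Linear(256, 1), nn.Sigmoid())  # P(synthesizable)
cost_head = nn.Linear(256, 1)                              # expected route cost

fp = torch.randn(4, 2048)                 # 4 candidate molecules
h = encoder(fp)
score = syn_head(h) - 0.1 * cost_head(h)  # one possible ranking rule
print(score.squeeze(-1))
```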
arXiv Detail & Related papers (2023-01-31T16:43:53Z)
- One-Pass Learning via Bridging Orthogonal Gradient Descent and Recursive Least-Squares [8.443742714362521]
We develop an algorithm for one-pass learning which seeks to perfectly fit every new datapoint while changing the parameters in a direction that causes the least change to the predictions on previous datapoints.
Our algorithm uses memory efficiently by exploiting the structure of the streaming data via incremental principal component analysis (IPCA).
Our experiments show the effectiveness of the proposed method compared to the baselines.
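For context, here is a minimal sketch of the recursive least-squares core that such one-pass methods build on: each new datapoint updates a linear model exactly once, with no replay buffer. The IPCA compression and orthogonal-gradient coupling from the paper are omitted.

```python
# Standard one-sample recursive least-squares (RLS) update; the paper's
# IPCA-based memory compression is not reproduced here.
import numpy as np

def rls_update(w, P, x, y):
    """One RLS step for the linear model y ~ w @ x."""
    Px = P @ x
    k = Px / (1.0 + x @ Px)  # gain vector
    w = w + k * (y - w @ x)  # fit the new point
    P = P - np.outer(k, Px)  # update inverse-covariance estimate
    return w, P

rng = np.random.default_rng(0)
w_true = np.array([2.0, -1.0, 0.5])
w, P = np.zeros(3), np.eye(3) * 100.0
for _ in range(200):         # stream of datapoints, each seen once
    x = rng.normal(size=3)
    w, P = rls_update(w, P, x, w_true @ x)
print(w)                     # converges to w_true
```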
arXiv Detail & Related papers (2022-07-28T02:01:31Z)
- Back to MLP: A Simple Baseline for Human Motion Prediction [59.18776744541904]
This paper tackles the problem of human motion prediction, i.e., forecasting future body poses from previously observed sequences.
We show that the performance of these approaches can be surpassed by a lightweight, purely MLP-based architecture with only 0.14M parameters.
An exhaustive evaluation on Human3.6M, AMASS and 3DPW datasets shows that our method, which we dub siMLPe, consistently outperforms all other approaches.
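The "simple baseline" idea can be sketched as plain fully connected layers mapping past poses to future poses, with no attention or recurrence. The sizes follow the common Human3.6M setup (22 joints in 3D), but siMLPe's actual blocks (DCT transform, normalization arrangement) and its 0.14M parameter budget are not reproduced here.

```python
# Sketch of an MLP-only motion predictor; layer sizes are illustrative.
import torch
import torch.nn as nn

past, future, dim = 50, 25, 66  # frames in/out, 22 joints * 3 coordinates
model = nn.Sequential(
    nn.Flatten(),               # (batch, past * dim)
    nn.Linear(past * dim, 1024), nn.ReLU(),
    nn.Linear(1024, future * dim),
)
x = torch.randn(16, past, dim)
print(model(x).reshape(16, future, dim).shape)  # torch.Size([16, 25, 66])
```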
arXiv Detail & Related papers (2022-07-04T16:35:58Z)
- On the Role of Bidirectionality in Language Model Pre-Training [85.14614350372004]
We study the role of bidirectionality in next token prediction, text infilling, zero-shot priming and fine-tuning.
We train models with up to 6.7B parameters, and find differences to remain consistent at scale.
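At the attention level, the two regimes compared reduce to a choice of mask, as the small sketch below illustrates: a causal mask for next-token prediction versus a full mask for bidirectional conditioning.

```python
# Causal vs. bidirectional attention masks (True = position is visible).
import numpy as np

n = 5
causal = np.tril(np.ones((n, n), dtype=bool))   # token i sees tokens <= i
bidirectional = np.ones((n, n), dtype=bool)     # every token sees all tokens
print(causal.astype(int))
print(bidirectional.astype(int))
```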
arXiv Detail & Related papers (2022-05-24T02:25:05Z)
- Fast, Accurate, and Simple Models for Tabular Data via Augmented Distillation [97.42894942391575]
We propose FAST-DAD to distill arbitrarily complex ensemble predictors into individual models like boosted trees, random forests, and deep networks.
Our individual distilled models are over 10x faster and more accurate than ensemble predictors produced by AutoML tools like H2O/AutoSklearn.
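The distillation-through-augmentation recipe can be sketched briefly: fit a large teacher ensemble, generate extra inputs, label them with the teacher, and train a single fast student on the enlarged set. The Gaussian jitter below is a simple stand-in for the Gibbs-style augmentation FAST-DAD actually uses.

```python
# Sketch of augmented distillation: teacher labels synthetic points, a
# single student learns from them. Augmentation scheme is an assumption.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=500, n_features=10, random_state=0)
teacher = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

rng = np.random.default_rng(0)
X_aug = np.vstack([X + rng.normal(scale=0.1, size=X.shape) for _ in range(3)])
y_aug = teacher.predict(X_aug)  # teacher labels the augmented points

student = DecisionTreeRegressor(max_depth=8).fit(X_aug, y_aug)
print(student.score(X, y))      # student mimics the ensemble cheaply
```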
arXiv Detail & Related papers (2020-06-25T09:57:47Z)