Preference Optimization for Molecule Synthesis with Conditional Residual Energy-based Models
- URL: http://arxiv.org/abs/2406.02066v1
- Date: Tue, 4 Jun 2024 07:49:30 GMT
- Title: Preference Optimization for Molecule Synthesis with Conditional Residual Energy-based Models
- Authors: Songtao Liu, Hanjun Dai, Yue Zhao, Peng Liu
- Abstract summary: Current data-driven strategies employ one-step retrosynthesis models and search algorithms to predict synthetic routes in a top-bottom manner.
Existing strategies cannot control the generation of synthetic routes based on possible criteria such as material costs, yields, and step count.
We propose a general and principled framework via conditional residual energy-based models (EBMs) that focus on the quality of the entire synthetic route.
- Score: 35.314442982529904
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Molecule synthesis through machine learning is one of the fundamental problems in drug discovery. Current data-driven strategies employ one-step retrosynthesis models and search algorithms to predict synthetic routes in a top-bottom manner. Despite their effective performance, these strategies face limitations in the molecule synthetic route generation due to a greedy selection of the next molecule set without any lookahead. Furthermore, existing strategies cannot control the generation of synthetic routes based on possible criteria such as material costs, yields, and step count. In this work, we propose a general and principled framework via conditional residual energy-based models (EBMs) that focus on the quality of the entire synthetic route based on the specific criteria. By incorporating an additional energy-based function into our probabilistic model, our proposed algorithm can enhance the quality of the most probable synthetic routes (with higher probabilities) generated by various strategies in a plug-and-play fashion. Extensive experiments demonstrate that our framework can consistently boost performance across various strategies and outperforms previous state-of-the-art top-1 accuracy by a margin of 2.5%. Code is available at https://github.com/SongtaoLiu0823/CREBM.
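The plug-and-play idea in the abstract, adding an energy term on top of an existing route generator, can be illustrated with a minimal sketch. It assumes the usual residual EBM form, p(route) ∝ p_base(route) · exp(−E(route)), normalized over a fixed candidate set. This is not the authors' implementation: the title indicates the energy is trained with preference optimization, whereas `step_count_energy`, the `rerank_routes` helper, and the toy routes below are hypothetical stand-ins chosen for illustration.

```python
import math
from typing import Callable, List, Sequence, Tuple

# A route is an ordered tuple of reaction identifiers (illustrative encoding).
Route = Tuple[str, ...]


def rerank_routes(
    candidates: Sequence[Tuple[Route, float]],
    energy_fn: Callable[[Route], float],
) -> List[Tuple[Route, float]]:
    """Re-rank (route, base_log_prob) pairs under p ∝ p_base * exp(-E).

    Probabilities are normalized over the supplied candidates only.
    """
    # Residual score: base log-probability minus the energy of the whole route.
    scores = [log_p - energy_fn(route) for route, log_p in candidates]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    top = max(scores)
    log_z = top + math.log(sum(math.exp(s - top) for s in scores))
    ranked = sorted(zip(candidates, scores), key=lambda pair: pair[1], reverse=True)
    return [(route, math.exp(score - log_z)) for (route, _), score in ranked]


def step_count_energy(route: Route, weight: float = 0.5) -> float:
    """Hypothetical hand-written energy: longer routes get higher energy."""
    return weight * len(route)


# Toy candidates: the base model prefers the four-step route.
candidates = [
    (("r1", "r2", "r3", "r4"), math.log(0.5)),
    (("r5", "r6"), math.log(0.3)),
    (("r7", "r8", "r9"), math.log(0.2)),
]

for route, prob in rerank_routes(candidates, step_count_energy):
    print(len(route), "steps", round(prob, 3))
```

With this toy energy the two-step route moves ahead of the base model's top choice, which is the kind of criterion-driven reordering the abstract describes. Swapping `step_count_energy` for a cost or yield proxy changes the criterion without touching the base generator.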
Related papers
- Quantum-inspired Reinforcement Learning for Synthesizable Drug Design [20.00111975801053]
We introduce a novel approach using the reinforcement learning method with quantum-inspired simulated annealing policy neural network to navigate the vast discrete space of chemical structures intelligently.
Specifically, we employ a deterministic REINFORCE algorithm using policy neural networks to output transitional probability to guide state transitions and local search.
Our methods are evaluated with the Practical Molecular Optimization (PMO) benchmark framework with a 10K query budget.
arXiv Detail & Related papers (2024-09-13T20:43:16Z)
- BatGPT-Chem: A Foundation Large Model For Retrosynthesis Prediction [65.93303145891628]
BatGPT-Chem is a large language model with 15 billion parameters, tailored for enhanced retrosynthesis prediction.
Our model captures a broad spectrum of chemical knowledge, enabling precise prediction of reaction conditions.
This development empowers chemists to adeptly address novel compounds, potentially expediting the innovation cycle in drug manufacturing and materials science.
arXiv Detail & Related papers (2024-08-19T05:17:40Z)
- Double-Ended Synthesis Planning with Goal-Constrained Bidirectional Search [27.09693306892583]
We present a formulation of synthesis planning with starting material constraints.
We propose Double-Ended Synthesis Planning (DESP), a novel CASP algorithm under a bidirectional graph search scheme.
DESP can make use of existing one-step retrosynthesis models, and we anticipate its performance to scale as these one-step model capabilities improve.
arXiv Detail & Related papers (2024-07-08T18:56:00Z)
- DirectMultiStep: Direct Route Generation for Multi-Step Retrosynthesis [0.0]
We introduce a transformer-based model that generates multi-step synthetic routes as a single string by conditionally predicting each molecule based on all preceding ones.
The model accommodates specific conditions such as the desired number of steps and starting materials, outperforming state-of-the-art methods on the PaRoutes dataset.
It also successfully predicts routes for FDA-approved drugs not included in the training data, showcasing its generalization capabilities.
arXiv Detail & Related papers (2024-05-22T20:39:05Z)
- SynthMix: Mixing up Aligned Synthesis for Medical Cross-Modality Domain Adaptation [17.10686650166592]
We propose SynthMix, an add-on module with a natural yet effective training policy.
Following the adversarial philosophy of GAN, we designed a mix-up synthesis scheme termed SynthMix.
It coherently mixed up aligned images of real and synthetic samples to stimulate the generation of fine-grained features.
arXiv Detail & Related papers (2023-05-07T01:37:46Z)
- Maximum Likelihood Learning of Unnormalized Models for Simulation-Based Inference [44.281860162298564]
We introduce two synthetic likelihood methods for Simulation-Based Inference.
We learn a conditional energy-based model (EBM) of the likelihood using synthetic data generated by the simulator.
We demonstrate the properties of both methods on a range of synthetic datasets, and apply them to a model of the neuroscience network in the crab.
arXiv Detail & Related papers (2022-10-26T14:38:24Z)
- FusionRetro: Molecule Representation Fusion via In-Context Learning for Retrosynthetic Planning [58.47265392465442]
Retrosynthetic planning aims to devise a complete multi-step synthetic route from starting materials to a target molecule.
Current strategies use a decoupled approach of single-step retrosynthesis models and search algorithms.
We propose a novel framework that utilizes context information for improved retrosynthetic planning.
arXiv Detail & Related papers (2022-09-30T08:44:58Z)
- Molecular Attributes Transfer from Non-Parallel Data [57.010952598634944]
We formulate molecular optimization as a style transfer problem and present a novel generative model that could automatically learn internal differences between two groups of non-parallel data.
Experiments on two molecular optimization tasks, toxicity modification and synthesizability improvement, demonstrate that our model significantly outperforms several state-of-the-art methods.
arXiv Detail & Related papers (2021-11-30T06:10:22Z)
- Energy-based View of Retrosynthesis [70.66156081030766]
We propose a framework that unifies sequence- and graph-based methods as energy-based models.
We present a novel dual variant within the framework that performs consistent training over Bayesian forward- and backward-prediction.
This model improves state-of-the-art performance by 9.6% for template-free approaches where the reaction type is unknown.
arXiv Detail & Related papers (2020-07-14T18:51:06Z)
- Retro*: Learning Retrosynthetic Planning with Neural Guided A* Search [83.22850633478302]
Retrosynthetic planning identifies a series of reactions that can lead to the synthesis of a target product.
Existing methods either require expensive return estimation by rollout with high variance, or optimize for search speed rather than the quality.
We propose Retro*, a neural-based A*-like algorithm that finds high-quality synthetic routes efficiently.
arXiv Detail & Related papers (2020-06-29T05:53:33Z)