RGFN: Synthesizable Molecular Generation Using GFlowNets
- URL: http://arxiv.org/abs/2406.08506v2
- Date: Wed, 06 Nov 2024 21:25:19 GMT
- Title: RGFN: Synthesizable Molecular Generation Using GFlowNets
- Authors: Michał Koziarski, Andrei Rekesh, Dmytro Shevchuk, Almer van der Sloot, Piotr Gaiński, Yoshua Bengio, Cheng-Hao Liu, Mike Tyers, Robert A. Batey
- Abstract summary: We propose Reaction-GFlowNet, an extension of the GFlowNet framework that operates directly in the space of chemical reactions.
RGFN allows out-of-the-box synthesizability while maintaining comparable quality of generated candidates.
We demonstrate the effectiveness of the proposed approach across a range of oracle models, including pretrained proxy models and GPU-accelerated docking.
- Score: 51.33672611338754
- Abstract: Generative models hold great promise for small molecule discovery, significantly increasing the size of search space compared to traditional in silico screening libraries. However, most existing machine learning methods for small molecule generation suffer from poor synthesizability of candidate compounds, making experimental validation difficult. In this paper we propose Reaction-GFlowNet (RGFN), an extension of the GFlowNet framework that operates directly in the space of chemical reactions, thereby allowing out-of-the-box synthesizability while maintaining comparable quality of generated candidates. We demonstrate that with the proposed set of reactions and building blocks, it is possible to obtain a search space of molecules orders of magnitude larger than existing screening libraries coupled with low cost of synthesis. We also show that the approach scales to very large fragment libraries, further increasing the number of potential molecules. We demonstrate the effectiveness of the proposed approach across a range of oracle models, including pretrained proxy models and GPU-accelerated docking.
Related papers
- Generative Flows on Synthetic Pathway for Drug Design [39.69010664056235]
We propose RxnFlow, which sequentially assembles molecules using predefined molecular building blocks and chemical reaction templates.
RxnFlow achieves state-of-the-art performance on CrossDocked 2020 for pocket-conditional generation, with an average Vina score of -8.85 kcal/mol and 34.8% synthesizability.
arXiv Detail & Related papers (2024-10-06T16:34:01Z)
- Bioptic -- A Target-Agnostic Potency-Based Small Molecules Search Engine [0.0]
We develop a target-agnostic, efficacy-based molecule search model.
We screen the ultra-large 40B Enamine REAL library with a 100% recall rate.
We benchmarked our model and several state-of-the-art models for both speed performance and retrieval quality of novel molecules.
arXiv Detail & Related papers (2024-06-13T17:53:29Z)
- SynFlowNet: Design of Diverse and Novel Molecules with Synthesis Constraints [16.21161274235011]
We introduce SynFlowNet, a GFlowNet model whose action space uses chemical reactions and buyable reactants to sequentially build new molecules.
By incorporating forward synthesis as an explicit constraint of the generative mechanism, we aim to bridge the gap between in silico molecular generation and real-world synthesis capabilities.
arXiv Detail & Related papers (2024-05-02T10:15:59Z)
- Feedback Efficient Online Fine-Tuning of Diffusion Models [52.170384048274364]
We propose a novel reinforcement learning procedure that efficiently explores the manifold of feasible samples.
We present a theoretical analysis providing a regret guarantee, as well as empirical validation across three domains.
arXiv Detail & Related papers (2024-02-26T07:24:32Z)
- An efficient graph generative model for navigating ultra-large combinatorial synthesis libraries [1.5495593104596397]
Virtual, make-on-demand chemical libraries have transformed early-stage drug discovery by unlocking vast, synthetically accessible regions of chemical space.
Recent years have witnessed rapid growth in these libraries from millions to trillions of compounds, hiding undiscovered, potent hits for a variety of therapeutic targets.
We propose the Combinatorial Synthesis Library Variational Auto-Encoder (CSLVAE) to overcome these challenges.
arXiv Detail & Related papers (2022-10-19T15:43:13Z)
- Retrieval-based Controllable Molecule Generation [63.44583084888342]
We propose a new retrieval-based framework for controllable molecule generation.
We use a small set of molecules to steer the pre-trained generative model towards synthesizing molecules that satisfy the given design criteria.
Our approach is agnostic to the choice of generative models and requires no task-specific fine-tuning.
arXiv Detail & Related papers (2022-08-23T17:01:16Z)
- FastFlows: Flow-Based Models for Molecular Graph Generation [4.9252608053969675]
FastFlows generates thousands of chemically valid molecules in seconds.
Our model is significantly simpler and easier to train than autoregressive molecular generative models.
arXiv Detail & Related papers (2022-01-28T21:08:31Z)
- RetroGNN: Approximating Retrosynthesis by Graph Neural Networks for De Novo Drug Design [75.14290780116002]
We train deep graph neural networks to approximate the outputs of a retrosynthesis planning software.
Our approach finds molecules predicted to be more likely to be antibiotics while maintaining good drug-like properties and being easily synthesizable.
arXiv Detail & Related papers (2020-11-25T22:04:16Z)
- MIMOSA: Multi-constraint Molecule Sampling for Molecule Optimization [51.00815310242277]
Generative models and reinforcement learning approaches have achieved initial success, but still face difficulties in simultaneously optimizing multiple drug properties.
We propose the MultI-constraint MOlecule SAmpling (MIMOSA) approach, a sampling framework that uses the input molecule as an initial guess and samples molecules from the target distribution.
arXiv Detail & Related papers (2020-10-05T20:18:42Z)
- ASGN: An Active Semi-supervised Graph Neural Network for Molecular Property Prediction [61.33144688400446]
We propose a novel framework called Active Semi-supervised Graph Neural Network (ASGN) by incorporating both labeled and unlabeled molecules.
In the teacher model, we propose a novel semi-supervised learning method to learn a general representation that jointly exploits information from molecular structure and molecular distribution.
Finally, we propose a novel active learning strategy based on molecular diversity to select informative data throughout the training of the framework.
arXiv Detail & Related papers (2020-07-07T04:22:39Z)