It Takes Two to Tango: Directly Optimizing for Constrained Synthesizability in Generative Molecular Design
- URL: http://arxiv.org/abs/2410.11527v1
- Date: Tue, 15 Oct 2024 11:59:51 GMT
- Title: It Takes Two to Tango: Directly Optimizing for Constrained Synthesizability in Generative Molecular Design
- Authors: Jeff Guo, Philippe Schwaller
- Abstract summary: Constrained synthesizability is an unaddressed challenge in generative molecular design.
We propose a novel reward function called TANimoto Group Overlap (TANGO).
TANGO transforms a sparse reward function into a dense and learnable reward function -- crucial for reinforcement learning.
- Score: 0.4037357056611557
- Abstract: Constrained synthesizability is an unaddressed challenge in generative molecular design. In particular, the challenge is to design molecules that satisfy multi-parameter optimization objectives while being synthesizable and while enforcing the presence of specific commercial building blocks in the synthesis. This is practically important for molecule re-purposing, sustainability, and efficiency. In this work, we propose a novel reward function called TANimoto Group Overlap (TANGO), which uses chemistry principles to transform a sparse reward function into a dense and learnable reward function, a property that is crucial for reinforcement learning. TANGO can augment general-purpose molecular generative models to directly optimize for constrained synthesizability while simultaneously optimizing for other properties relevant to drug discovery using reinforcement learning. Our framework is general and addresses starting-material, intermediate, and divergent synthesis constraints. Contrary to most existing works in the field, we show that incentivizing a general-purpose model (one without any inductive biases) is a productive approach to navigating challenging optimization scenarios. We demonstrate this by showing that the trained models explicitly learn a desirable distribution. Our framework is the first generative approach to tackle constrained synthesizability.
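The sparse-versus-dense distinction in the abstract can be illustrated with a small sketch. This is not the TANGO definition, which the abstract does not give; it only contrasts an exact-match reward with a Tanimoto-similarity reward. Fingerprints are modeled as plain sets of integer feature indices, and the names `tanimoto`, `sparse_reward`, and `dense_reward` are hypothetical.

```python
from typing import FrozenSet, Iterable

# Assumption: a molecule is represented by a set of "on" fingerprint bits.
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto (Jaccard) similarity: |A & B| / |A | B|."""
    if not a and not b:
        return 1.0  # convention: two empty fingerprints are identical
    return len(a & b) / len(a | b)


def sparse_reward(route_nodes: Iterable[Fingerprint], target: Fingerprint) -> float:
    """1.0 only if the enforced building block appears exactly in the route."""
    return 1.0 if any(node == target for node in route_nodes) else 0.0


def dense_reward(route_nodes: Iterable[Fingerprint], target: Fingerprint) -> float:
    """Best similarity between any route node and the enforced building block.

    Illustrative stand-in for a dense reward, not the paper's formula.
    """
    return max((tanimoto(node, target) for node in route_nodes), default=0.0)


# Hypothetical enforced building block and two candidate synthesis routes.
target = frozenset({1, 2, 3, 4})
near_miss = [frozenset({1, 2, 3, 9}), frozenset({7, 8})]
exact = [frozenset({1, 2, 3, 4}), frozenset({7, 8})]

print(sparse_reward(near_miss, target), dense_reward(near_miss, target))
print(sparse_reward(exact, target), dense_reward(exact, target))
```

Under these assumptions, a route that nearly contains the enforced block scores zero on the sparse reward but gets partial credit on the dense one, which gives a reinforcement learning agent a signal to follow before it finds an exact match.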
Related papers
- Generative Artificial Intelligence for Navigating Synthesizable Chemical Space [25.65907958071386]
We introduce SynFormer, a generative modeling framework designed to efficiently explore and navigate synthesizable chemical space.
By incorporating a scalable transformer architecture and a diffusion module for building block selection, SynFormer surpasses existing models in synthesizable molecular design.
arXiv Detail & Related papers (2024-10-04T15:09:05Z) - Aligning Target-Aware Molecule Diffusion Models with Exact Energy Optimization [147.7899503829411]
AliDiff is a novel framework to align pretrained target diffusion models with preferred functional properties.
It can generate molecules with state-of-the-art binding energies, achieving an average Vina score of up to -7.07.
arXiv Detail & Related papers (2024-07-01T06:10:29Z) - SynFlowNet: Design of Diverse and Novel Molecules with Synthesis Constraints [16.21161274235011]
We introduce SynFlowNet, a GFlowNet model whose action space uses chemical reactions and buyable reactants to sequentially build new molecules.
By incorporating forward synthesis as an explicit constraint of the generative mechanism, we aim at bridging the gap between in silico molecular generation and real world synthesis capabilities.
arXiv Detail & Related papers (2024-05-02T10:15:59Z) - DecompOpt: Controllable and Decomposed Diffusion Models for Structure-based Molecular Optimization [49.85944390503957]
DecompOpt is a structure-based molecular optimization method based on a controllable and decomposed diffusion model.
We show that DecompOpt can efficiently generate molecules with better properties than strong de novo baselines.
arXiv Detail & Related papers (2024-03-07T02:53:40Z) - Molecule Design by Latent Prompt Transformer [76.2112075557233]
This work explores the challenging problem of molecule design by framing it as a conditional generative modeling task.
We propose a novel generative model comprising three components: (1) a latent vector with a learnable prior distribution; (2) a molecule generation model based on a causal Transformer, which uses the latent vector as a prompt; and (3) a property prediction model that predicts a molecule's target properties and/or constraint values using the latent prompt.
arXiv Detail & Related papers (2024-02-27T03:33:23Z) - Retrieval-based Controllable Molecule Generation [63.44583084888342]
We propose a new retrieval-based framework for controllable molecule generation.
We use a small set of molecules to steer the pre-trained generative model towards synthesizing molecules that satisfy the given design criteria.
Our approach is agnostic to the choice of generative models and requires no task-specific fine-tuning.
arXiv Detail & Related papers (2022-08-23T17:01:16Z) - Molecular Attributes Transfer from Non-Parallel Data [57.010952598634944]
We formulate molecular optimization as a style transfer problem and present a novel generative model that could automatically learn internal differences between two groups of non-parallel data.
Experiments on two molecular optimization tasks, toxicity modification and synthesizability improvement, demonstrate that our model significantly outperforms several state-of-the-art methods.
arXiv Detail & Related papers (2021-11-30T06:10:22Z) - MIMOSA: Multi-constraint Molecule Sampling for Molecule Optimization [51.00815310242277]
Generative models and reinforcement learning approaches have had initial success, but still face difficulties in simultaneously optimizing multiple drug properties.
We propose the MultI-constraint MOlecule SAmpling (MIMOSA) approach, a sampling framework that uses the input molecule as an initial guess and samples molecules from the target distribution.
arXiv Detail & Related papers (2020-10-05T20:18:42Z) - Scaffold-constrained molecular generation [0.0]
We build on the well-known SMILES-based Recurrent Neural Network (RNN) generative model, with a modified sampling procedure to achieve scaffold-constrained generation.
We showcase the method's ability to perform scaffold-constrained generation on various tasks.
arXiv Detail & Related papers (2020-09-15T15:41:18Z) - Molecular Design in Synthetically Accessible Chemical Space via Deep Reinforcement Learning [0.0]
We argue that existing generative methods are limited in their ability to favourably shift the distributions of molecular properties during optimization.
We propose a novel Reinforcement Learning framework for molecular design in which an agent learns to directly optimize through a space of synthetically-accessible drug-like molecules.
arXiv Detail & Related papers (2020-04-29T16:29:28Z)