Related papers: The Synthesizability of Molecules Proposed by Generative Models

URL: http://arxiv.org/abs/2002.07007v1
Date: Mon, 17 Feb 2020 15:41:28 GMT
Title: The Synthesizability of Molecules Proposed by Generative Models
Authors: Wenhao Gao, Connor W. Coley
Abstract summary: Discovery of functional molecules is an expensive and time-consuming process. One class of techniques of growing interest for early-stage drug discovery is de novo molecular generation and optimization. These techniques can suggest novel molecular structures intended to maximize a multi-objective function. However, the utility of these approaches is stymied by ignorance of synthesizability.
Score: 3.032184156362992
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The discovery of functional molecules is an expensive and time-consuming process, exemplified by the rising costs of small molecule therapeutic discovery. One class of techniques of growing interest for early-stage drug discovery is de novo molecular generation and optimization, catalyzed by the development of new deep learning approaches. These techniques can suggest novel molecular structures intended to maximize a multi-objective function, e.g., suitability as a therapeutic against a particular target, without relying on brute-force exploration of a chemical space. However, the utility of these approaches is stymied by ignorance of synthesizability. To highlight the severity of this issue, we use a data-driven computer-aided synthesis planning program to quantify how often molecules proposed by state-of-the-art generative models cannot be readily synthesized. Our analysis demonstrates that there are several tasks for which these models generate unrealistic molecular structures despite performing well on popular quantitative benchmarks. Synthetic complexity heuristics can successfully bias generation toward synthetically-tractable chemical space, although doing so necessarily detracts from the primary objective. This analysis suggests that to improve the utility of these models in real discovery workflows, new algorithm development is warranted.
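The trade-off described in the abstract, where a synthetic complexity heuristic steers selection toward tractable molecules at some cost to the primary objective, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the candidate names, the scores, the linear penalty form, and the `weight` parameter are hypothetical and are not taken from the paper, which uses real generative models and a computer-aided synthesis planning program.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    objective: float   # hypothetical primary objective score, higher is better
    complexity: float  # hypothetical synthetic complexity heuristic, higher is harder to make


def penalized_score(c: Candidate, weight: float) -> float:
    # Linear penalty (an assumed form): subtract weighted complexity from the objective.
    return c.objective - weight * c.complexity


def best(candidates: list[Candidate], weight: float) -> Candidate:
    # Pick the candidate with the highest penalized score.
    return max(candidates, key=lambda c: penalized_score(c, weight))


# Made-up candidates: A scores best on the objective but is the hardest to synthesize.
candidates = [
    Candidate("A", objective=0.95, complexity=0.90),
    Candidate("B", objective=0.80, complexity=0.30),
    Candidate("C", objective=0.60, complexity=0.10),
]

unbiased = best(candidates, weight=0.0)  # ignores synthesizability
biased = best(candidates, weight=0.5)    # penalizes complexity

print(unbiased.name, unbiased.objective)
print(biased.name, biased.objective)
```

With the penalty switched on, the selected candidate is easier to make but has a lower primary objective score than the unpenalized choice, which is the cost the abstract refers to.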

Related papers

Diffusion Models for Molecules: A Survey of Methods and Tasks [56.44565051667812]
Generative tasks about molecules are crucial for drug discovery and material design. Diffusion models have emerged as an impressive class of deep generative models. This paper conducts a comprehensive survey of diffusion model-based molecular generative methods.
arXiv Detail & Related papers (2025-02-13T17:22:50Z)
MolMiner: Transformer architecture for fragment-based autoregressive generation of molecular stories [7.366789601705544]
Chemical validity, interpretability of the generation process and flexibility to variable molecular sizes are among some of the remaining challenges for generative models in computational materials design. We propose an autoregressive approach that decomposes molecular generation into a sequence of discrete and interpretable steps. Our results show that the model can effectively bias the generation distribution according to the prompted multi-target objective.
arXiv Detail & Related papers (2024-11-10T22:00:55Z)
Latent Chemical Space Searching for Plug-in Multi-objective Molecule Generation [9.442146563809953]
We develop a versatile 'plug-in' molecular generation model that incorporates objectives related to target affinity, drug-likeness, and synthesizability. We identify PSO-ENP as the optimal variant for multi-objective molecular generation and optimization.
arXiv Detail & Related papers (2024-04-10T02:37:24Z)
UAlign: Pushing the Limit of Template-free Retrosynthesis Prediction with Unsupervised SMILES Alignment [51.49238426241974]
This paper introduces UAlign, a template-free graph-to-sequence pipeline for retrosynthesis prediction. By combining graph neural networks and Transformers, our method can more effectively leverage the inherent graph structure of molecules.
arXiv Detail & Related papers (2024-03-25T03:23:03Z)
STRIDE: Structure-guided Generation for Inverse Design of Molecules [0.24578723416255752]
STRIDE is a generative molecule workflow that generates novel molecules with an unconditional generative model guided by known molecules without any retraining. Our generated molecules have on average 21.7% lower synthetic accessibility scores and also reduce ionization potential by 5.9% of generated molecules via guiding.
arXiv Detail & Related papers (2023-11-06T08:22:35Z)
Implicit Geometry and Interaction Embeddings Improve Few-Shot Molecular Property Prediction [53.06671763877109]
We develop molecular embeddings that encode complex molecular characteristics to improve the performance of few-shot molecular property prediction. Our approach leverages large amounts of synthetic data, namely the results of molecular docking calculations. On multiple molecular property prediction benchmarks, training from the embedding space substantially improves Multi-Task, MAML, and Prototypical Network few-shot learning performance.
arXiv Detail & Related papers (2023-02-04T01:32:40Z)
Retrieval-based Controllable Molecule Generation [63.44583084888342]
We propose a new retrieval-based framework for controllable molecule generation. We use a small set of molecules to steer the pre-trained generative model towards synthesizing molecules that satisfy the given design criteria. Our approach is agnostic to the choice of generative models and requires no task-specific fine-tuning.
arXiv Detail & Related papers (2022-08-23T17:01:16Z)
Accurate Machine Learned Quantum-Mechanical Force Fields for Biomolecular Simulations [51.68332623405432]
Molecular dynamics (MD) simulations allow atomistic insights into chemical and biological processes. Recently, machine learned force fields (MLFFs) emerged as an alternative means to execute MD simulations. This work proposes a general approach to constructing accurate MLFFs for large-scale molecular simulations.
arXiv Detail & Related papers (2022-05-17T13:08:28Z)
CELLS: Cost-Effective Evolution in Latent Space for Goal-Directed Molecular Generation [23.618366377098614]
We propose a cost-effective evolution strategy in latent space, which optimizes the molecular latent representation vectors. We adopt a pre-trained molecular generative model to map the latent and observation spaces. We conduct extensive experiments on multiple optimization tasks comparing the proposed framework to several advanced techniques.
arXiv Detail & Related papers (2021-11-30T11:02:18Z)
Molecular Attributes Transfer from Non-Parallel Data [57.010952598634944]
We formulate molecular optimization as a style transfer problem and present a novel generative model that could automatically learn internal differences between two groups of non-parallel data. Experiments on two molecular optimization tasks, toxicity modification and synthesizability improvement, demonstrate that our model significantly outperforms several state-of-the-art methods.
arXiv Detail & Related papers (2021-11-30T06:10:22Z)
ChemoVerse: Manifold traversal of latent spaces for novel molecule discovery [0.7742297876120561]
It is essential to identify molecular structures with the desired chemical properties. Generative models based on neural networks and machine learning are now widely used to design virtual libraries of drug-like compounds.
arXiv Detail & Related papers (2020-09-29T12:11:40Z)
Learning To Navigate The Synthetically Accessible Chemical Space Using Reinforcement Learning [75.95376096628135]
We propose a novel forward synthesis framework powered by reinforcement learning (RL) for de novo drug design. In this setup, the agent learns to navigate through the immense synthetically accessible chemical space. We describe how the end-to-end training in this study represents an important paradigm in radically expanding the synthesizable chemical space.
arXiv Detail & Related papers (2020-04-26T21:40:03Z)

This list is automatically generated from the titles and abstracts of the papers on this site.