The Synthesizability of Molecules Proposed by Generative Models
- URL: http://arxiv.org/abs/2002.07007v1
- Date: Mon, 17 Feb 2020 15:41:28 GMT
- Title: The Synthesizability of Molecules Proposed by Generative Models
- Authors: Wenhao Gao, Connor W. Coley
- Abstract summary: Discovery of functional molecules is an expensive and time-consuming process.
One class of techniques of growing interest for early-stage drug discovery is de novo molecular generation and optimization.
These techniques can suggest novel molecular structures intended to maximize a multi-objective function.
However, the utility of these approaches is stymied by ignorance of synthesizability.
- Score: 3.032184156362992
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The discovery of functional molecules is an expensive and time-consuming
process, exemplified by the rising costs of small molecule therapeutic
discovery. One class of techniques of growing interest for early-stage drug
discovery is de novo molecular generation and optimization, catalyzed by the
development of new deep learning approaches. These techniques can suggest novel
molecular structures intended to maximize a multi-objective function, e.g.,
suitability as a therapeutic against a particular target, without relying on
brute-force exploration of a chemical space. However, the utility of these
approaches is stymied by ignorance of synthesizability. To highlight the
severity of this issue, we use a data-driven computer-aided synthesis planning
program to quantify how often molecules proposed by state-of-the-art generative
models cannot be readily synthesized. Our analysis demonstrates that there are
several tasks for which these models generate unrealistic molecular structures
despite performing well on popular quantitative benchmarks. Synthetic
complexity heuristics can successfully bias generation toward
synthetically-tractable chemical space, although doing so necessarily detracts
from the primary objective. This analysis suggests that to improve the utility
of these models in real discovery workflows, new algorithm development is
warranted.
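The trade-off noted in the abstract, where biasing generation with a synthetic complexity heuristic detracts from the primary objective, can be illustrated with a minimal sketch. The abstract does not say how the heuristic is combined with the objective, so the weighted penalty below is an assumption, and the `Candidate` class, the `penalized_score` and `best_candidate` functions, and all numeric values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A hypothetical proposed molecule with two made-up scores."""
    name: str
    primary_score: float         # illustrative objective value, higher is better
    synthetic_complexity: float  # illustrative heuristic value, higher is harder to make


def penalized_score(candidate: Candidate, weight: float) -> float:
    """Primary objective minus a weighted complexity penalty (an assumed form)."""
    return candidate.primary_score - weight * candidate.synthetic_complexity


def best_candidate(candidates, weight: float) -> Candidate:
    """Return the candidate with the highest penalized score."""
    return max(candidates, key=lambda c: penalized_score(c, weight))


# Hypothetical data: better primary scores come with higher complexity.
CANDIDATES = [
    Candidate("A", primary_score=0.95, synthetic_complexity=4.5),
    Candidate("B", primary_score=0.80, synthetic_complexity=2.5),
    Candidate("C", primary_score=0.60, synthetic_complexity=1.5),
]

if __name__ == "__main__":
    for weight in (0.0, 0.1, 0.5):
        chosen = best_candidate(CANDIDATES, weight)
        print(f"weight={weight}: chose {chosen.name}, "
              f"primary={chosen.primary_score}, "
              f"complexity={chosen.synthetic_complexity}")
```

With these made-up numbers, raising the penalty weight moves the selection from A to B to C: the chosen candidate becomes easier to make, but its primary score falls each time.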
Related papers
- MolMiner: Transformer architecture for fragment-based autoregressive generation of molecular stories [7.366789601705544]
Chemical validity, interpretability of the generation process and flexibility to variable molecular sizes are among some of the remaining challenges for generative models in computational materials design.
We propose an autoregressive approach that decomposes molecular generation into a sequence of discrete and interpretable steps.
Our results show that the model can effectively bias the generation distribution according to the prompted multi-target objective.
arXiv Detail & Related papers (2024-11-10T22:00:55Z)
- Latent Chemical Space Searching for Plug-in Multi-objective Molecule Generation [9.442146563809953]
We develop a versatile 'plug-in' molecular generation model that incorporates objectives related to target affinity, drug-likeness, and synthesizability.
We identify PSO-ENP as the optimal variant for multi-objective molecular generation and optimization.
arXiv Detail & Related papers (2024-04-10T02:37:24Z)
- UAlign: Pushing the Limit of Template-free Retrosynthesis Prediction with Unsupervised SMILES Alignment [51.49238426241974]
This paper introduces UAlign, a template-free graph-to-sequence pipeline for retrosynthesis prediction.
By combining graph neural networks and Transformers, our method can more effectively leverage the inherent graph structure of molecules.
arXiv Detail & Related papers (2024-03-25T03:23:03Z)
- Implicit Geometry and Interaction Embeddings Improve Few-Shot Molecular Property Prediction [53.06671763877109]
We develop molecular embeddings that encode complex molecular characteristics to improve the performance of few-shot molecular property prediction.
Our approach leverages large amounts of synthetic data, namely the results of molecular docking calculations.
On multiple molecular property prediction benchmarks, training from the embedding space substantially improves Multi-Task, MAML, and Prototypical Network few-shot learning performance.
arXiv Detail & Related papers (2023-02-04T01:32:40Z)
- Retrieval-based Controllable Molecule Generation [63.44583084888342]
We propose a new retrieval-based framework for controllable molecule generation.
We use a small set of molecules to steer the pre-trained generative model towards synthesizing molecules that satisfy the given design criteria.
Our approach is agnostic to the choice of generative models and requires no task-specific fine-tuning.
arXiv Detail & Related papers (2022-08-23T17:01:16Z)
- Accurate Machine Learned Quantum-Mechanical Force Fields for Biomolecular Simulations [51.68332623405432]
Molecular dynamics (MD) simulations allow atomistic insights into chemical and biological processes.
Recently, machine learned force fields (MLFFs) emerged as an alternative means to execute MD simulations.
This work proposes a general approach to constructing accurate MLFFs for large-scale molecular simulations.
arXiv Detail & Related papers (2022-05-17T13:08:28Z)
- CELLS: Cost-Effective Evolution in Latent Space for Goal-Directed Molecular Generation [23.618366377098614]
We propose a cost-effective evolution strategy in latent space, which optimizes the molecular latent representation vectors.
We adopt a pre-trained molecular generative model to map the latent and observation spaces.
We conduct extensive experiments on multiple optimization tasks comparing the proposed framework to several advanced techniques.
arXiv Detail & Related papers (2021-11-30T11:02:18Z)
- Molecular Attributes Transfer from Non-Parallel Data [57.010952598634944]
We formulate molecular optimization as a style transfer problem and present a novel generative model that could automatically learn internal differences between two groups of non-parallel data.
Experiments on two molecular optimization tasks, toxicity modification and synthesizability improvement, demonstrate that our model significantly outperforms several state-of-the-art methods.
arXiv Detail & Related papers (2021-11-30T06:10:22Z)
- ChemoVerse: Manifold traversal of latent spaces for novel molecule discovery [0.7742297876120561]
It is essential to identify molecular structures with the desired chemical properties.
Recent advances in generative models using neural networks and machine learning are being widely used to design virtual libraries of drug-like compounds.
arXiv Detail & Related papers (2020-09-29T12:11:40Z)
- Learning To Navigate The Synthetically Accessible Chemical Space Using Reinforcement Learning [75.95376096628135]
We propose a novel forward synthesis framework powered by reinforcement learning (RL) for de novo drug design.
In this setup, the agent learns to navigate through the immense synthetically accessible chemical space.
We describe how the end-to-end training in this study represents an important paradigm in radically expanding the synthesizable chemical space.
arXiv Detail & Related papers (2020-04-26T21:40:03Z)