STRIDE: Structure-guided Generation for Inverse Design of Molecules
- URL: http://arxiv.org/abs/2311.06297v1
- Date: Mon, 6 Nov 2023 08:22:35 GMT
- Title: STRIDE: Structure-guided Generation for Inverse Design of Molecules
- Authors: Shehtab Zaman, Denis Akhiyarov, Mauricio Araya-Polo, Kenneth Chiu
- Abstract summary: $\textbf{STRIDE}$ is a generative molecule workflow that generates novel molecules with an unconditional generative model guided by known molecules without any retraining.
Our generated molecules have on average 21.7% lower synthetic accessibility scores, and guiding also reduces the ionization potential of generated molecules by 5.9%.
- Score: 0.24578723416255752
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Machine learning, and especially deep learning, has had an increasing impact on
molecule and materials design. In particular, given the growing access to an
abundance of high-quality small molecule data for generative modeling for drug
design, results for drug discovery have been promising. However, for many
important classes of materials such as catalysts, antioxidants, and
metal-organic frameworks, such large datasets are not available. Such families
of molecules with limited samples and structural similarities are especially
prevalent for industrial applications. As is well-known, retraining and even
fine-tuning are challenging on such small datasets. Novel, practically
applicable molecules are most often derivatives of well-known molecules,
suggesting approaches to addressing data scarcity. To address this problem, we
introduce $\textbf{STRIDE}$, a generative molecule workflow that generates
novel molecules with an unconditional generative model guided by known
molecules without any retraining. We generate molecules outside of the training
data from a highly specialized set of antioxidant molecules. Our generated
molecules have on average 21.7% lower synthetic accessibility scores, and guiding
also reduces the ionization potential of generated molecules by 5.9%.
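As a reading aid for the percentages quoted in the abstract, the following is a minimal sketch of how an average relative reduction between two sets of scores can be computed. The function name `relative_reduction` and all score values are hypothetical and chosen for illustration; they are not taken from the paper, and the paper's exact evaluation procedure may differ. Lower synthetic accessibility (SA) scores conventionally indicate molecules that are easier to synthesize.

```python
from statistics import mean


def relative_reduction(baseline, guided):
    """Percent by which the mean of `guided` lies below the mean of `baseline`."""
    base_mean = mean(baseline)
    if base_mean == 0:
        raise ValueError("baseline mean is zero; relative change is undefined")
    return 100.0 * (base_mean - mean(guided)) / base_mean


# Hypothetical SA scores for illustration only (not values from the paper).
unguided_sa = [4.0, 5.0, 6.0]
guided_sa = [3.0, 4.0, 5.0]

reduction = relative_reduction(unguided_sa, guided_sa)
print(f"mean SA score reduced by {reduction:.1f}%")
```

With these made-up inputs the mean drops from 5.0 to 4.0, a 20.0% reduction. Under this reading, a figure such as "21.7% lower" would compare the mean score of guided samples against that of unguided samples.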
Related papers
- Implicit Geometry and Interaction Embeddings Improve Few-Shot Molecular Property Prediction [53.06671763877109]
We develop molecular embeddings that encode complex molecular characteristics to improve the performance of few-shot molecular property prediction.
Our approach leverages large amounts of synthetic data, namely the results of molecular docking calculations.
On multiple molecular property prediction benchmarks, training from the embedding space substantially improves Multi-Task, MAML, and Prototypical Network few-shot learning performance.
arXiv Detail & Related papers (2023-02-04T01:32:40Z)
- Domain-Agnostic Molecular Generation with Chemical Feedback [44.063584808910896]
MolGen is a pre-trained molecular language model tailored specifically for molecule generation.
It internalizes structural and grammatical insights through the reconstruction of over 100 million molecular SELFIES.
Our chemical feedback paradigm steers the model away from molecular hallucinations, ensuring alignment between the model's estimated probabilities and real-world chemical preferences.
arXiv Detail & Related papers (2023-01-26T17:52:56Z)
- Hybrid Quantum Generative Adversarial Networks for Molecular Simulation and Drug Discovery [13.544339314714902]
Current classical computational power is inadequate to simulate anything larger than small molecules.
Tens of billions of dollars are spent every year in these research experiments.
Deep generative models for graph-structured data provide a fresh perspective on the issue of chemical synthesis.
arXiv Detail & Related papers (2022-12-15T13:36:35Z)
- DiffBP: Generative Diffusion of 3D Molecules for Target Protein Binding [51.970607704953096]
Previous works usually generate atoms in an auto-regressive way, where element types and 3D coordinates of atoms are generated one by one.
In real-world molecular systems, the interactions among atoms in an entire molecule are global, so the energy function is pair-coupled among atoms.
In this work, a generative diffusion model for molecular 3D structures based on target proteins is established, at a full-atom level in a non-autoregressive way.
arXiv Detail & Related papers (2022-11-21T07:02:15Z)
- Exploring Chemical Space with Score-based Out-of-distribution Generation [57.15855198512551]
We propose a score-based diffusion scheme that incorporates out-of-distribution control in the generative stochastic differential equation (SDE).
Since some novel molecules may not meet the basic requirements of real-world drugs, MOOD performs conditional generation by utilizing the gradients from a property predictor.
We experimentally validate that MOOD is able to explore the chemical space beyond the training distribution, generating molecules that outscore ones found with existing methods, and even the top 0.01% of the original training pool.
arXiv Detail & Related papers (2022-06-06T06:17:11Z)
- Scalable Fragment-Based 3D Molecular Design with Reinforcement Learning [68.8204255655161]
We introduce a novel framework for scalable 3D design that uses a hierarchical agent to build molecules.
In a variety of experiments, we show that our agent, guided only by energy considerations, can efficiently learn to produce molecules with over 100 atoms.
arXiv Detail & Related papers (2022-02-01T18:54:24Z)
- Fragment-based molecular generative model with high generalization ability and synthetic accessibility [0.0]
We propose a fragment-based molecular generative model which designs new molecules with target properties.
A key feature of our model is a high generalization ability in terms of property control and fragment types.
We show that the model can generate molecules with the simultaneous control of multiple target properties at a high success rate.
arXiv Detail & Related papers (2021-11-25T04:44:37Z)
- Structure-aware generation of drug-like molecules [2.449909275410288]
Deep generative methods have shown promise in proposing novel molecules from scratch (de novo design).
We propose a novel supervised model that generates molecular graphs jointly with 3D pose in a discretised molecular space.
We evaluate our model using a docking benchmark and find that guided generation improves predicted binding affinities by 8% and drug-likeness scores by 10% over the baseline.
arXiv Detail & Related papers (2021-11-07T15:19:54Z)
- Learning Latent Space Energy-Based Prior Model for Molecule Generation [59.875533935578375]
We learn a latent space energy-based prior model with SMILES representation for molecule modeling.
Our method is able to generate molecules with validity and uniqueness competitive with state-of-the-art models.
arXiv Detail & Related papers (2020-10-19T09:34:20Z)
- Self-Supervised Graph Transformer on Large-Scale Molecular Data [73.3448373618865]
We propose a novel framework, GROVER, for molecular representation learning.
GROVER can learn rich structural and semantic information of molecules from enormous unlabelled molecular data.
We pre-train GROVER with 100 million parameters on 10 million unlabelled molecules -- the biggest GNN and the largest training dataset in molecular representation learning.
arXiv Detail & Related papers (2020-06-18T08:37:04Z)
- The Synthesizability of Molecules Proposed by Generative Models [3.032184156362992]
Discovery of functional molecules is an expensive and time-consuming process.
One class of techniques of growing interest for early-stage drug discovery is de novo molecular generation and optimization.
These techniques can suggest novel molecular structures intended to maximize a multi-objective function.
However, the utility of these approaches is stymied by ignorance of synthesizability.
arXiv Detail & Related papers (2020-02-17T15:41:28Z)