Exploring Chemical Space with Score-based Out-of-distribution Generation
- URL: http://arxiv.org/abs/2206.07632v3
- Date: Sat, 3 Jun 2023 08:43:39 GMT
- Title: Exploring Chemical Space with Score-based Out-of-distribution Generation
- Authors: Seul Lee, Jaehyeong Jo, Sung Ju Hwang
- Abstract summary: We propose a score-based diffusion scheme that incorporates out-of-distribution control in the generative stochastic differential equation (SDE).
Since some novel molecules may not meet the basic requirements of real-world drugs, MOOD performs conditional generation by utilizing the gradients from a property predictor.
We experimentally validate that MOOD is able to explore the chemical space beyond the training distribution, generating molecules that outscore ones found with existing methods, and even the top 0.01% of the original training pool.
- Score: 57.15855198512551
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A well-known limitation of existing molecular generative models is that the
generated molecules highly resemble those in the training set. To generate
truly novel molecules that may have even better properties for de novo drug
discovery, more powerful exploration in the chemical space is necessary. To
this end, we propose Molecular Out-Of-distribution Diffusion (MOOD), a
score-based diffusion scheme that incorporates out-of-distribution (OOD)
control in the generative stochastic differential equation (SDE) with simple
control of a hyperparameter, and thus requires no additional cost. Since some
novel molecules may not meet the basic requirements of real-world drugs, MOOD
performs conditional generation by utilizing the gradients from a property
predictor that guides the reverse-time diffusion process to high-scoring
regions according to target properties such as protein-ligand interactions,
drug-likeness, and synthesizability. This allows MOOD to search for novel and
meaningful molecules rather than generating unseen yet trivial ones. We
experimentally validate that MOOD is able to explore the chemical space beyond
the training distribution, generating molecules that outscore ones found with
existing methods, and even the top 0.01% of the original training pool. Our
code is available at https://github.com/SeulLee05/MOOD.
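The two mechanisms named in the abstract, a hyperparameter that controls how far sampling drifts from the training distribution and a property-predictor gradient that steers the reverse-time diffusion, can be illustrated with a one-dimensional toy. The sketch below is not the MOOD implementation: the training distribution is assumed to be a standard Gaussian so that its score is known analytically, the scaling of the score by (1 - sqrt(ood_weight)) is an illustrative assumption about how such a hyperparameter might enter the SDE, and `property_grad`, `ood_weight`, `guidance_weight`, and `noise_scale` are hypothetical names.

```python
import numpy as np


def data_score(x, t, noise_scale):
    """Analytic score of the perturbed toy data distribution.

    Assumption: training data ~ N(0, 1) and the forward SDE adds Gaussian
    noise with variance noise_scale * t, so the marginal at time t is
    N(0, 1 + noise_scale * t).
    """
    return -x / (1.0 + noise_scale * t)


def property_grad(x, target=3.0):
    """Gradient of a hypothetical property predictor f(x) = -(x - target)^2 / 2."""
    return -(x - target)


def sample(n_samples=2000, n_steps=500, ood_weight=0.0,
           guidance_weight=0.0, noise_scale=9.0, seed=0):
    """Euler-Maruyama integration of a guided reverse-time SDE (toy version)."""
    rng = np.random.default_rng(seed)
    dt = 1.0 / n_steps
    # Start from the prior, i.e. the forward marginal at t = 1.
    x = rng.normal(0.0, np.sqrt(1.0 + noise_scale), size=n_samples)
    # Illustrative assumption: OOD control down-weights the data score,
    # so samples are pulled less strongly toward the training distribution.
    score_scale = 1.0 - np.sqrt(ood_weight)
    for i in range(n_steps, 0, -1):
        t = i * dt
        guided_score = (score_scale * data_score(x, t, noise_scale)
                        + guidance_weight * property_grad(x))
        noise = rng.normal(size=n_samples)
        # One reverse step: drift g^2 * score * dt plus diffusion g * sqrt(dt) * z,
        # with g^2 = noise_scale for this forward SDE.
        x = x + noise_scale * guided_score * dt + np.sqrt(noise_scale * dt) * noise
    return x


if __name__ == "__main__":
    for label, kwargs in [
        ("in-distribution", {}),
        ("OOD control", {"ood_weight": 0.04}),
        ("property guidance", {"guidance_weight": 1.0}),
    ]:
        samples = sample(**kwargs)
        print(f"{label}: mean={samples.mean():.2f}, std={samples.std():.2f}")
```

In this toy, raising `ood_weight` widens the sampled distribution beyond the training data, while raising `guidance_weight` shifts samples toward the region the hypothetical predictor scores highly. In a molecular setting both the score and the predictor would be learned networks over graphs rather than closed-form functions.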
Related papers
- Conditional Synthesis of 3D Molecules with Time Correction Sampler [58.0834973489875]
Time-Aware Conditional Synthesis (TACS) is a novel approach to conditional generation on diffusion models.
It integrates adaptively controlled plug-and-play "online" guidance into a diffusion model, driving samples toward the desired properties.
(2024-11-01T12:59:25Z)
- Data-Efficient Molecular Generation with Hierarchical Textual Inversion [48.816943690420224]
We introduce Hierarchical textual Inversion for Molecular generation (HI-Mol), a novel data-efficient molecular generation method.
HI-Mol is inspired by the importance of hierarchical information, e.g., both coarse- and fine-grained features, in understanding the molecule distribution.
Compared to the conventional textual inversion method in the image domain using a single-level token embedding, our multi-level token embeddings allow the model to effectively learn the underlying low-shot molecule distribution.
(2024-05-05T08:35:23Z)
- Diffusing on Two Levels and Optimizing for Multiple Properties: A Novel Approach to Generating Molecules with Desirable Properties [33.2976176283611]
We present a novel approach to generating molecules with desirable properties, which expands the diffusion model framework with multiple innovative designs.
To get desirable molecular fragments, we develop a novel electronic effect based fragmentation method.
We show that the molecules generated by our proposed method have better validity, uniqueness, novelty, Fréchet ChemNet Distance (FCD), QED, and PlogP than those generated by current SOTA models.
(2023-10-05T11:43:21Z)
- Molecule Design by Latent Space Energy-Based Modeling and Gradual Distribution Shifting [53.44684898432997]
Generation of molecules with desired chemical and biological properties is critical for drug discovery.
We propose a probabilistic generative model to capture the joint distribution of molecules and their properties.
Our method achieves very strong performances on various molecule design tasks.
(2023-06-09T03:04:21Z)
- Towards Predicting Equilibrium Distributions for Molecular Systems with Deep Learning [60.02391969049972]
We introduce a novel deep learning framework, called Distributional Graphormer (DiG), in an attempt to predict the equilibrium distribution of molecular systems.
DiG employs deep neural networks to transform a simple distribution towards the equilibrium distribution, conditioned on a descriptor of a molecular system.
(2023-06-08T17:12:08Z)
- Hit and Lead Discovery with Explorative RL and Fragment-based Molecule Generation [34.26748101294543]
We propose a novel framework that generates pharmacochemically acceptable molecules with large docking scores.
Our method constrains the generated molecules to a realistic and qualified chemical space and effectively explores the space to find drugs.
Our model produces molecules of higher quality compared to existing methods while achieving state-of-the-art performance on two of three targets.
(2021-10-04T07:21:00Z)
- MIMOSA: Multi-constraint Molecule Sampling for Molecule Optimization [51.00815310242277]
Generative models and reinforcement learning approaches have had initial success, but still face difficulties in simultaneously optimizing multiple drug properties.
We propose the MultI-constraint MOlecule SAmpling (MIMOSA) approach, a sampling framework that uses the input molecule as an initial guess and samples molecules from the target distribution.
(2020-10-05T20:18:42Z)
- ChemoVerse: Manifold traversal of latent spaces for novel molecule discovery [0.7742297876120561]
It is essential to identify molecular structures with the desired chemical properties.
Recent advances in generative models using neural networks and machine learning are being widely used to design virtual libraries of drug-like compounds.
(2020-09-29T12:11:40Z)