Navigating the Design Space of Equivariant Diffusion-Based Generative
Models for De Novo 3D Molecule Generation
- URL: http://arxiv.org/abs/2309.17296v2
- Date: Fri, 24 Nov 2023 16:08:38 GMT
- Title: Navigating the Design Space of Equivariant Diffusion-Based Generative
Models for De Novo 3D Molecule Generation
- Authors: Tuan Le, Julian Cremer, Frank Noé, Djork-Arné Clevert, Kristof Schütt
- Abstract summary: Deep generative diffusion models are a promising avenue for 3D de novo molecular design in materials science and drug discovery.
We explore the design space of E(3)-equivariant diffusion models, focusing on previously unexplored areas.
We present the EQGAT-diff model, which consistently outperforms established models for the QM9 and GEOM-Drugs datasets.
- Score: 1.3124513975412255
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep generative diffusion models are a promising avenue for 3D de novo
molecular design in materials science and drug discovery. However, their
utility is still limited by suboptimal performance on large molecular
structures and limited training data. To address this gap, we explore the
design space of E(3)-equivariant diffusion models, focusing on previously
unexplored areas. Our extensive comparative analysis evaluates the interplay
between continuous and discrete state spaces. From this investigation, we
present the EQGAT-diff model, which consistently outperforms established models
for the QM9 and GEOM-Drugs datasets. Significantly, EQGAT-diff treats atom
positions as continuous and chemical elements and bond types as categorical, and
it uses time-dependent loss weighting, substantially improving training
convergence, the quality of generated samples, and inference time. We also show that
including chemically motivated additional features like hybridization states in
the diffusion process enhances the validity of generated molecules. To further
strengthen the applicability of diffusion models to limited training data, we
investigate the transferability of EQGAT-diff trained on the large PubChem3D
dataset with implicit hydrogen atoms to target different data distributions.
Fine-tuning EQGAT-diff for just a few iterations shows an efficient
distribution shift, further improving performance across datasets.
Finally, we test our model on the Crossdocked dataset for structure-based de
novo ligand generation, where it achieves state-of-the-art Vina docking scores,
underlining the importance of our findings.
Related papers
- Context-Guided Diffusion for Out-of-Distribution Molecular and Protein Design [30.241533997522236]
We develop context-guided diffusion (CGD), a simple plug-and-play method that leverages unlabeled data and smoothness constraints to improve the out-of-distribution generalization of guided diffusion models.
This approach leads to substantial performance gains across various settings, including continuous, discrete, and graph-structured diffusion processes with applications across drug discovery, materials science, and protein design.
arXiv Detail & Related papers (2024-07-16T17:34:00Z)
- Aligning Target-Aware Molecule Diffusion Models with Exact Energy Optimization [147.7899503829411]
We propose a novel and general alignment framework to align pretrained target diffusion models with preferred functional properties, named AliDiff.
AliDiff shifts the target-conditioned chemical distribution towards regions with higher binding affinity and structural rationality, specified by user-defined reward functions.
We show that AliDiff can generate molecules with state-of-the-art binding energies with up to -7.07 Avg. Vina Score, while maintaining strong molecular properties.
arXiv Detail & Related papers (2024-07-01T06:10:29Z)
- Diffusion Models in $\textit{De Novo}$ Drug Design [0.0]
Diffusion models have emerged as powerful tools for molecular generation, particularly in the context of 3D molecular structures.
This review focuses on the technical implementation of diffusion models tailored for 3D molecular generation.
arXiv Detail & Related papers (2024-06-07T06:34:13Z)
- DecompDiff: Diffusion Models with Decomposed Priors for Structure-Based Drug Design [62.68420322996345]
Existing structure-based drug design methods treat all ligand atoms equally.
We propose a new diffusion model, DecompDiff, with decomposed priors over arms and scaffold.
Our approach achieves state-of-the-art performance in generating high-affinity molecules.
arXiv Detail & Related papers (2024-02-26T05:21:21Z)
- Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution [67.9215891673174]
We propose score entropy as a novel loss that naturally extends score matching to discrete spaces.
We test our Score Entropy Discrete Diffusion models on standard language modeling tasks.
arXiv Detail & Related papers (2023-10-25T17:59:12Z)
- Learning Joint 2D & 3D Diffusion Models for Complete Molecule Generation [32.66694406638287]
We propose a new joint 2D and 3D diffusion model (JODO) that generates molecules with atom types, formal charges, bond information, and 3D coordinates.
Our model can also be extended for inverse molecular design targeting single or multiple quantum properties.
arXiv Detail & Related papers (2023-05-21T04:49:53Z)
- 3D Equivariant Diffusion for Target-Aware Molecule Generation and Affinity Prediction [9.67574543046801]
The inclusion of 3D structures during targeted drug design shows superior performance to other target-free models.
We develop a 3D equivariant diffusion model to solve the above challenges.
Our model could generate molecules with more realistic 3D structures and better affinities towards the protein targets, and improve binding affinity ranking and prediction without retraining.
arXiv Detail & Related papers (2023-03-06T23:01:43Z)
- A Survey on Generative Diffusion Model [75.93774014861978]
Diffusion models are an emerging class of deep generative models.
They have certain limitations, including a time-consuming iterative generation process and confinement to high-dimensional Euclidean space.
This survey presents a plethora of advanced techniques aimed at enhancing diffusion models.
arXiv Detail & Related papers (2022-09-06T16:56:21Z)
- Pre-training via Denoising for Molecular Property Prediction [53.409242538744444]
We describe a pre-training technique that utilizes large datasets of 3D molecular structures at equilibrium.
Inspired by recent advances in noise regularization, our pre-training objective is based on denoising.
arXiv Detail & Related papers (2022-05-31T22:28:34Z)
- Equivariant Diffusion for Molecule Generation in 3D [74.289191525633]
This work introduces a diffusion model for molecule generation in 3D that is equivariant to Euclidean transformations.
Experimentally, the proposed method significantly outperforms previous 3D molecular generative methods regarding the quality of generated samples and efficiency at training time.
arXiv Detail & Related papers (2022-03-31T12:52:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.