Navigating the Design Space of Equivariant Diffusion-Based Generative
Models for De Novo 3D Molecule Generation
- URL: http://arxiv.org/abs/2309.17296v2
- Date: Fri, 24 Nov 2023 16:08:38 GMT
- Title: Navigating the Design Space of Equivariant Diffusion-Based Generative
Models for De Novo 3D Molecule Generation
- Authors: Tuan Le, Julian Cremer, Frank Noé, Djork-Arné Clevert, Kristof
Schütt
- Abstract summary: Deep generative diffusion models are a promising avenue for 3D de novo molecular design in materials science and drug discovery.
We explore the design space of E(3)-equivariant diffusion models, focusing on previously unexplored areas.
We present the EQGAT-diff model, which consistently outperforms established models for the QM9 and GEOM-Drugs datasets.
- Score: 1.3124513975412255
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep generative diffusion models are a promising avenue for 3D de novo
molecular design in materials science and drug discovery. However, their
utility is still limited by suboptimal performance on large molecular
structures and limited training data. To address this gap, we explore the
design space of E(3)-equivariant diffusion models, focusing on previously
unexplored areas. Our extensive comparative analysis evaluates the interplay
between continuous and discrete state spaces. From this investigation, we
present the EQGAT-diff model, which consistently outperforms established models
for the QM9 and GEOM-Drugs datasets. Notably, EQGAT-diff treats atom positions
as continuous and chemical elements and bond types as categorical, and it uses
time-dependent loss weighting, which substantially improves training convergence,
the quality of generated samples, and inference time. We also showcase that
including chemically motivated additional features like hybridization states in
the diffusion process enhances the validity of generated molecules. To further
strengthen the applicability of diffusion models to limited training data, we
investigate the transferability of EQGAT-diff trained on the large PubChem3D
dataset with implicit hydrogen atoms to target different data distributions.
Fine-tuning EQGAT-diff for just a few iterations shows an efficient
distribution shift, further improving performance across datasets.
Finally, we test our model on the CrossDocked dataset for structure-based de
novo ligand generation, where it shows state-of-the-art performance on Vina
docking scores, underlining the importance of our findings.
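
The mixed continuous/categorical design described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the cosine noise schedule, the uniform resampling of categorical labels, the clipping bounds of the loss weight, and all function names are assumptions chosen for illustration only.

```python
import numpy as np

def cosine_alpha_bar(t, T, s=0.008):
    """Cumulative signal level alpha_bar(t) from a cosine schedule (assumed)."""
    f = lambda u: np.cos((u / T + s) / (1 + s) * np.pi / 2) ** 2
    return f(t) / f(0)

def noise_positions(x0, alpha_bar, rng):
    """Gaussian forward step on atom coordinates (continuous state)."""
    # Centre the molecule and the noise so the result has zero centre of mass.
    x0c = x0 - x0.mean(axis=0, keepdims=True)
    eps = rng.standard_normal(x0.shape)
    eps -= eps.mean(axis=0, keepdims=True)
    return np.sqrt(alpha_bar) * x0c + np.sqrt(1.0 - alpha_bar) * eps

def noise_categories(labels, alpha_bar, num_classes, rng):
    """Categorical forward step for atom or bond types (discrete state).

    Each label is kept with probability alpha_bar and otherwise resampled
    uniformly over the classes (assumed prior, not taken from the paper).
    """
    keep = rng.random(labels.shape) < alpha_bar
    random_labels = rng.integers(0, num_classes, size=labels.shape)
    return np.where(keep, labels, random_labels)

def loss_weight(alpha_bar, lo=0.05, hi=1.5):
    """Time-dependent loss weight: signal-to-noise ratio clipped to [lo, hi].

    The clipped-SNR form and the bounds are assumptions for this sketch.
    """
    snr = alpha_bar / max(1.0 - alpha_bar, 1e-12)
    return float(np.clip(snr, lo, hi))

# Example: noise a toy 5-atom molecule halfway through a 1000-step schedule.
rng = np.random.default_rng(0)
x0 = rng.standard_normal((5, 3))      # toy coordinates
atoms = np.array([0, 1, 2, 3, 4])     # toy element indices
ab = cosine_alpha_bar(500, 1000)
xt = noise_positions(x0, ab, rng)
at = noise_categories(atoms, ab, num_classes=5, rng=rng)
w = loss_weight(ab)
```

In a sketch like this, a denoising network would receive `xt` and `at`, and its per-sample loss would be multiplied by `w`, so that nearly clean inputs (high signal-to-noise ratio) count more than almost pure noise.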
Related papers
- Exploring Discrete Flow Matching for 3D De Novo Molecule Generation [0.0]
Flow matching is a recently proposed generative modeling framework that has achieved impressive performance on a variety of tasks.
We present FlowMol-CTMC, an open-source model that achieves state-of-the-art performance for 3D de novo design with fewer learnable parameters than existing methods.
Date: 2024-11-25T18:27:39Z
- Bridging the Gap between Learning and Inference for Diffusion-Based Molecule Generation [18.936142688346816]
GapDiff is a training framework that mitigates the data distributional disparity between training and inference.
We conduct experiments using a 3D molecular generation model on the CrossDocked 2020 dataset.
Date: 2024-11-08T10:53:39Z
- Conditional Synthesis of 3D Molecules with Time Correction Sampler [58.0834973489875]
Time-Aware Conditional Synthesis (TACS) is a novel approach to conditional generation on diffusion models.
It integrates adaptively controlled plug-and-play "online" guidance into a diffusion model, driving samples toward the desired properties.
Date: 2024-11-01T12:59:25Z
- Aligning Target-Aware Molecule Diffusion Models with Exact Energy Optimization [147.7899503829411]
AliDiff is a novel framework to align pretrained target diffusion models with preferred functional properties.
It can generate molecules with state-of-the-art binding energies with up to -7.07 Avg. Vina Score.
Date: 2024-07-01T06:10:29Z
- DecompDiff: Diffusion Models with Decomposed Priors for Structure-Based Drug Design [62.68420322996345]
Existing structure-based drug design methods treat all ligand atoms equally.
We propose a new diffusion model, DecompDiff, with decomposed priors over arms and scaffold.
Our approach achieves state-of-the-art performance in generating high-affinity molecules.
Date: 2024-02-26T05:21:21Z
- Learning Joint 2D & 3D Diffusion Models for Complete Molecule Generation [32.66694406638287]
We propose a new joint 2D and 3D diffusion model (JODO) that generates molecules with atom types, formal charges, bond information, and 3D coordinates.
Our model can also be extended for inverse molecular design targeting single or multiple quantum properties.
Date: 2023-05-21T04:49:53Z
- Structure-based Drug Design with Equivariant Diffusion Models [40.73626627266543]
We present DiffSBDD, an SE(3)-equivariant diffusion model that generates novel ligands conditioned on protein pockets.
Our in silico experiments demonstrate that DiffSBDD captures the statistics of the ground truth data effectively.
These results support the assumption that diffusion models represent the complex distribution of structural data more accurately than previous methods.
Date: 2022-10-24T15:51:21Z
- A Survey on Generative Diffusion Model [75.93774014861978]
Diffusion models are an emerging class of deep generative models.
They have certain limitations, including a time-consuming iterative generation process and confinement to high-dimensional Euclidean space.
This survey presents a plethora of advanced techniques aimed at enhancing diffusion models.
Date: 2022-09-06T16:56:21Z
- Pre-training via Denoising for Molecular Property Prediction [53.409242538744444]
We describe a pre-training technique that utilizes large datasets of 3D molecular structures at equilibrium.
Inspired by recent advances in noise regularization, our pre-training objective is based on denoising.
Date: 2022-05-31T22:28:34Z
- Equivariant Diffusion for Molecule Generation in 3D [74.289191525633]
This work introduces a diffusion model for molecule generation in 3D that is equivariant to Euclidean transformations.
Experimentally, the proposed method significantly outperforms previous 3D molecular generative methods regarding the quality of generated samples and efficiency at training time.
Date: 2022-03-31T12:52:25Z