Context-Guided Diffusion for Out-of-Distribution Molecular and Protein Design
- URL: http://arxiv.org/abs/2407.11942v1
- Date: Tue, 16 Jul 2024 17:34:00 GMT
- Title: Context-Guided Diffusion for Out-of-Distribution Molecular and Protein Design
- Authors: Leo Klarner, Tim G. J. Rudner, Garrett M. Morris, Charlotte M. Deane, Yee Whye Teh
- Abstract summary: We develop context-guided diffusion (CGD), a simple plug-and-play method that leverages unlabeled data and smoothness constraints to improve the out-of-distribution generalization of guided diffusion models.
This approach leads to substantial performance gains across various settings, including continuous, discrete, and graph-structured diffusion processes with applications across drug discovery, materials science, and protein design.
- Score: 30.241533997522236
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Generative models have the potential to accelerate key steps in the discovery of novel molecular therapeutics and materials. Diffusion models have recently emerged as a powerful approach, excelling at unconditional sample generation and, with data-driven guidance, conditional generation within their training domain. Reliably sampling from high-value regions beyond the training data, however, remains an open challenge -- with current methods predominantly focusing on modifying the diffusion process itself. In this paper, we develop context-guided diffusion (CGD), a simple plug-and-play method that leverages unlabeled data and smoothness constraints to improve the out-of-distribution generalization of guided diffusion models. We demonstrate that this approach leads to substantial performance gains across various settings, including continuous, discrete, and graph-structured diffusion processes with applications across drug discovery, materials science, and protein design.
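The abstract describes CGD as regularizing the guidance model with unlabeled data and smoothness constraints so that guided sampling extrapolates sensibly beyond the training set. The sketch below is a minimal PyTorch illustration under stated assumptions, not the paper's implementation: the names GuidanceModel, cgd_loss, and x_ctx, and the simple quadratic pull toward a flat prior, are placeholders for the paper's more general context-guided regularizer.

```python
# Minimal sketch (assumptions, not the paper's exact objective): a property
# predictor is fit on labeled data while its outputs on unlabeled "context"
# points are pulled toward an uninformative prior, encouraging smooth,
# conservative behavior that a guided diffusion sampler can follow
# outside the training distribution.
import torch
import torch.nn as nn

class GuidanceModel(nn.Module):
    def __init__(self, dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden), nn.SiLU(),
            nn.Linear(hidden, hidden), nn.SiLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x):
        return self.net(x)

def cgd_loss(model, x_lab, y_lab, x_ctx, prior_mean=0.0, reg_weight=1.0):
    # Supervised fit on the labeled training set.
    sup = nn.functional.mse_loss(model(x_lab), y_lab)
    # Context regularizer: predictions on unlabeled context points are
    # penalized for drifting away from a flat prior mean (assumed form).
    reg = ((model(x_ctx) - prior_mean) ** 2).mean()
    return sup + reg_weight * reg

# Usage sketch with random stand-in data: x_lab/y_lab from the labeled set,
# x_ctx drawn from a broader unlabeled pool (e.g., out-of-domain molecules).
model = GuidanceModel(dim=64)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x_lab, y_lab = torch.randn(32, 64), torch.randn(32, 1)
x_ctx = torch.randn(128, 64)
opt.zero_grad()
loss = cgd_loss(model, x_lab, y_lab, x_ctx)
loss.backward()
opt.step()
```

In a guided diffusion sampler, gradients of such a regularized predictor would steer the reverse process, with the context penalty discouraging overconfident predictions far from the labeled data.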
Related papers
- Bridging the Gap between Learning and Inference for Diffusion-Based Molecule Generation [18.936142688346816]
GapDiff is a training framework that mitigates the distributional disparity between training and inference data.
We conduct experiments using a 3D molecular generation model on the CrossDocked 2020 dataset.
arXiv Detail & Related papers (2024-11-08T10:53:39Z)
- Conditional Synthesis of 3D Molecules with Time Correction Sampler [58.0834973489875]
Time-Aware Conditional Synthesis (TACS) is a novel approach to conditional generation with diffusion models.
It integrates adaptively controlled plug-and-play "online" guidance into a diffusion model, driving samples toward the desired properties.
arXiv Detail & Related papers (2024-11-01T12:59:25Z)
- Constrained Diffusion Models via Dual Training [80.03953599062365]
Diffusion processes are prone to generating samples that reflect biases in a training dataset.
We develop constrained diffusion models by imposing diffusion constraints based on desired distributions.
We show that our constrained diffusion models generate new data from a mixture data distribution that achieves the optimal trade-off between the objective and the constraints.
arXiv Detail & Related papers (2024-08-27T14:25:42Z)
- Neural Flow Diffusion Models: Learnable Forward Process for Improved Diffusion Modelling [2.1779479916071067]
We introduce Neural Flow Diffusion Models (NFDM), a novel framework that enhances diffusion models by supporting a broader range of forward processes.
We also propose a novel parameterization technique for learning the forward process.
Results underscore NFDM's versatility and its potential for a wide range of applications.
arXiv Detail & Related papers (2024-04-19T15:10:54Z)
- An Overview of Diffusion Models: Applications, Guided Generation, Statistical Rates and Optimization [59.63880337156392]
Diffusion models have achieved tremendous success in computer vision, audio, reinforcement learning, and computational biology.
Despite this significant empirical success, the theory of diffusion models remains limited.
This paper provides a well-rounded theoretical exposition to stimulate forward-looking theories and methods for diffusion models.
arXiv Detail & Related papers (2024-04-11T14:07:25Z)
- Theoretical Insights for Diffusion Guidance: A Case Study for Gaussian Mixture Models [59.331993845831946]
Diffusion models benefit from injecting task-specific information into the score function to steer sample generation toward desired properties; a standard form of this score-based guidance is written out after this list.
This paper provides the first theoretical study towards understanding the influence of guidance on diffusion models in the context of Gaussian mixture models.
arXiv Detail & Related papers (2024-03-03T23:15:48Z)
- Fast Sampling via Discrete Non-Markov Diffusion Models [49.598085130313514]
We propose a discrete non-Markov diffusion model, which admits accelerated reverse sampling for discrete data generation.
Our method significantly reduces the number of function evaluations (i.e., calls to the neural network), making the sampling process much faster.
arXiv Detail & Related papers (2023-12-14T18:14:11Z)
- Navigating the Design Space of Equivariant Diffusion-Based Generative Models for De Novo 3D Molecule Generation [1.3124513975412255]
Deep generative diffusion models are a promising avenue for 3D de novo molecular design in materials science and drug discovery.
We explore the design space of E(3)-equivariant diffusion models, focusing on previously unexplored areas.
We present the EQGAT-diff model, which consistently outperforms established models on the QM9 and GEOM-Drugs datasets.
arXiv Detail & Related papers (2023-09-29T14:53:05Z)
- Towards Controllable Diffusion Models via Reward-Guided Exploration [15.857464051475294]
We propose a novel framework that guides the training phase of diffusion models via reinforcement learning (RL).
RL enables policy gradients to be computed from samples of a pay-off distribution proportional to exponentially scaled rewards, rather than from the policies themselves.
Experiments on 3D shape and molecule generation tasks show significant improvements over existing conditional diffusion models.
arXiv Detail & Related papers (2023-04-14T13:51:26Z)
- A Survey on Generative Diffusion Model [75.93774014861978]
Diffusion models are an emerging class of deep generative models.
They have certain limitations, including a time-consuming iterative generation process and confinement to high-dimensional Euclidean space.
This survey presents a plethora of advanced techniques aimed at enhancing diffusion models.
arXiv Detail & Related papers (2022-09-06T16:56:21Z)
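As background for the guidance-focused entries above (e.g., "Theoretical Insights for Diffusion Guidance"), data-driven guidance is commonly derived by decomposing the conditional score and approximating the likelihood term with a learned predictor. A minimal sketch in generic notation (the guidance scale and predictor below are not tied to any single paper above):

```latex
% Conditional score = unconditional score + likelihood (predictor) gradient
\nabla_{x_t} \log p_t(x_t \mid y)
  = \nabla_{x_t} \log p_t(x_t) + \nabla_{x_t} \log p_t(y \mid x_t)

% Guided reverse process: the true likelihood is replaced by a learned
% surrogate p_\phi, scaled by a guidance weight \gamma
\tilde{s}_\theta(x_t, t)
  = s_\theta(x_t, t) + \gamma \, \nabla_{x_t} \log p_\phi(y \mid x_t)
```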
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.