Related papers: Context-Guided Diffusion for Out-of-Distribution Molecular and Protein Design

URL: http://arxiv.org/abs/2407.11942v1
Date: Tue, 16 Jul 2024 17:34:00 GMT
Title: Context-Guided Diffusion for Out-of-Distribution Molecular and Protein Design
Authors: Leo Klarner, Tim G. J. Rudner, Garrett M. Morris, Charlotte M. Deane, Yee Whye Teh
Abstract summary: We develop context-guided diffusion (CGD), a simple plug-and-play method that leverages unlabeled data and smoothness constraints to improve the out-of-distribution generalization of guided diffusion models. This approach leads to substantial performance gains across various settings, including continuous, discrete, and graph-structured diffusion processes with applications across drug discovery, materials science, and protein design.
Score: 30.241533997522236
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Generative models have the potential to accelerate key steps in the discovery of novel molecular therapeutics and materials. Diffusion models have recently emerged as a powerful approach, excelling at unconditional sample generation and, with data-driven guidance, conditional generation within their training domain. Reliably sampling from high-value regions beyond the training data, however, remains an open challenge -- with current methods predominantly focusing on modifying the diffusion process itself. In this paper, we develop context-guided diffusion (CGD), a simple plug-and-play method that leverages unlabeled data and smoothness constraints to improve the out-of-distribution generalization of guided diffusion models. We demonstrate that this approach leads to substantial performance gains across various settings, including continuous, discrete, and graph-structured diffusion processes with applications across drug discovery, materials science, and protein design.

Related papers

Iterative Distillation for Reward-Guided Fine-Tuning of Diffusion Models in Biomolecular Design [53.93023688824764]
We address the problem of fine-tuning diffusion models for reward-guided generation in biomolecular design. We propose an iterative distillation-based fine-tuning framework that enables diffusion models to optimize for arbitrary reward functions. Our off-policy formulation, combined with KL divergence minimization, enhances training stability and sample efficiency compared to existing RL-based methods.
arXiv Detail & Related papers (2025-07-01T05:55:28Z)
Simple Guidance Mechanisms for Discrete Diffusion Models [44.377206440698586]
We develop a new class of diffusion models that leverage uniform noise and that are more guidable because they can continuously edit their outputs. We improve the quality of these models with a novel continuous-time variational lower bound that yields state-of-the-art performance.
arXiv Detail & Related papers (2024-12-13T15:08:30Z)
Bridging the Gap between Learning and Inference for Diffusion-Based Molecule Generation [18.936142688346816]
GapDiff is a training framework that mitigates the data distributional disparity between training and inference. We conduct experiments using a 3D molecular generation model on the CrossDocked 2020 dataset.
arXiv Detail & Related papers (2024-11-08T10:53:39Z)
Conditional Synthesis of 3D Molecules with Time Correction Sampler [58.0834973489875]
Time-Aware Conditional Synthesis (TACS) is a novel approach to conditional generation on diffusion models. It integrates adaptively controlled plug-and-play "online" guidance into a diffusion model, driving samples toward the desired properties.
arXiv Detail & Related papers (2024-11-01T12:59:25Z)
Constrained Diffusion Models via Dual Training [80.03953599062365]
Diffusion processes are prone to generating samples that reflect biases in a training dataset. We develop constrained diffusion models by imposing diffusion constraints based on desired distributions. We show that our constrained diffusion models generate new data from a mixture data distribution that achieves the optimal trade-off among objective and constraints.
arXiv Detail & Related papers (2024-08-27T14:25:42Z)
LDMol: A Text-to-Molecule Diffusion Model with Structurally Informative Latent Space Surpasses AR Models [55.5427001668863]
We present a novel latent diffusion model dubbed LDMol for text-conditioned molecule generation. Experiments show that LDMol outperforms the existing autoregressive baselines on the text-to-molecule generation benchmark. We show that LDMol can be applied to downstream tasks such as molecule-to-text retrieval and text-guided molecule editing.
arXiv Detail & Related papers (2024-05-28T04:59:13Z)
Neural Flow Diffusion Models: Learnable Forward Process for Improved Diffusion Modelling [2.1779479916071067]
We introduce a novel framework that enhances diffusion models by supporting a broader range of forward processes. We also propose a novel parameterization technique for learning the forward process. Results underscore NFDM's versatility and its potential for a wide range of applications.
arXiv Detail & Related papers (2024-04-19T15:10:54Z)
An Overview of Diffusion Models: Applications, Guided Generation, Statistical Rates and Optimization [59.63880337156392]
Diffusion models have achieved tremendous success in computer vision, audio, reinforcement learning, and computational biology. Despite the significant empirical success, theory of diffusion models is very limited. This paper provides a well-rounded theoretical exposure for stimulating forward-looking theories and methods of diffusion models.
arXiv Detail & Related papers (2024-04-11T14:07:25Z)
Theoretical Insights for Diffusion Guidance: A Case Study for Gaussian Mixture Models [59.331993845831946]
Diffusion models benefit from instillation of task-specific information into the score function to steer the sample generation towards desired properties. This paper provides the first theoretical study towards understanding the influence of guidance on diffusion models in the context of Gaussian mixture models.
arXiv Detail & Related papers (2024-03-03T23:15:48Z)
Fast Sampling via Discrete Non-Markov Diffusion Models [49.598085130313514]
We propose a discrete non-Markov diffusion model, which admits an accelerated reverse sampling for discrete data generation. Our method significantly reduces the number of function evaluations (i.e., calls to the neural network), making the sampling process much faster.
arXiv Detail & Related papers (2023-12-14T18:14:11Z)
Navigating the Design Space of Equivariant Diffusion-Based Generative Models for De Novo 3D Molecule Generation [1.3124513975412255]
Deep generative diffusion models are a promising avenue for 3D de novo molecular design in materials science and drug discovery. We explore the design space of E(3)-equivariant diffusion models, focusing on previously unexplored areas. We present the EQGAT-diff model, which consistently outperforms established models for the QM9 and GEOM-Drugs datasets.
arXiv Detail & Related papers (2023-09-29T14:53:05Z)
Towards Controllable Diffusion Models via Reward-Guided Exploration [15.857464051475294]
We propose a novel framework that guides the training phase of diffusion models via reinforcement learning (RL). RL enables calculating policy gradients via samples from a pay-off distribution proportional to exponentially scaled rewards, rather than from policies themselves. Experiments on 3D shape and molecule generation tasks show significant improvements over existing conditional diffusion models.
arXiv Detail & Related papers (2023-04-14T13:51:26Z)
A Survey on Generative Diffusion Model [75.93774014861978]
Diffusion models are an emerging class of deep generative models. They have certain limitations, including a time-consuming iterative generation process and confinement to high-dimensional Euclidean space. This survey presents a plethora of advanced techniques aimed at enhancing diffusion models.
arXiv Detail & Related papers (2022-09-06T16:56:21Z)

This list is automatically generated from the titles and abstracts of the papers on this site.