Related papers: Protein Design with Guided Discrete Diffusion

Protein Design with Guided Discrete Diffusion

URL: http://arxiv.org/abs/2305.20009v2
Date: Tue, 12 Dec 2023 05:09:38 GMT
Title: Protein Design with Guided Discrete Diffusion
Authors: Nate Gruver, Samuel Stanton, Nathan C. Frey, Tim G. J. Rudner, Isidro Hotzel, Julien Lafrance-Vanasse, Arvind Rajpal, Kyunghyun Cho, and Andrew Gordon Wilson
Abstract summary: A popular approach to protein design is to combine a generative model with a discriminative model for conditional sampling. We propose diffusioN Optimized Sampling (NOS), a guidance method for discrete diffusion models. NOS makes it possible to perform design directly in sequence space, circumventing significant limitations of structure-based methods.
Score: 67.06148688398677
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: A popular approach to protein design is to combine a generative model with a discriminative model for conditional sampling. The generative model samples plausible sequences while the discriminative model guides a search for sequences with high fitness. Given its broad success in conditional sampling, classifier-guided diffusion modeling is a promising foundation for protein design, leading many to develop guided diffusion models for structure with inverse folding to recover sequences. In this work, we propose diffusioN Optimized Sampling (NOS), a guidance method for discrete diffusion models that follows gradients in the hidden states of the denoising network. NOS makes it possible to perform design directly in sequence space, circumventing significant limitations of structure-based methods, including scarce data and challenging inverse design. Moreover, we use NOS to generalize LaMBO, a Bayesian optimization procedure for sequence design that facilitates multiple objectives and edit-based constraints. The resulting method, LaMBO-2, enables discrete diffusions and stronger performance with limited edits through a novel application of saliency maps. We apply LaMBO-2 to a real-world protein design task, optimizing antibodies for higher expression yield and binding affinity to several therapeutic targets under locality and developability constraints, attaining a 99% expression rate and 40% binding rate in exploratory in vitro experiments.

Related papers

Discrete Diffusion Trajectory Alignment via Stepwise Decomposition [70.9024656666945]
We propose a novel preference optimization method for masked discrete diffusion models.<n>Instead of applying the reward on the final output and backpropagating the gradient to the entire discrete denoising process, we decompose the problem into a set of stepwise alignment objectives.<n> Experiments across multiple domains including DNA sequence design, protein inverse folding, and language modeling consistently demonstrate the superiority of our approach.
arXiv Detail & Related papers (2025-07-07T09:52:56Z)
Reward-Guided Iterative Refinement in Diffusion Models at Test-Time with Applications to Protein and DNA Design [87.58981407469977]
We propose a novel framework for inference-time reward optimization with diffusion models inspired by evolutionary algorithms. Our approach employs an iterative refinement process consisting of two steps in each iteration: noising and reward-guided denoising.
arXiv Detail & Related papers (2025-02-20T17:48:45Z)
Scalable Discrete Diffusion Samplers: Combinatorial Optimization and Statistical Physics [7.873510219469276]
We introduce two novel training methods for discrete diffusion samplers. These methods yield memory-efficient training and achieve state-of-the-art results in unsupervised optimization. We introduce adaptations of SN-NIS and Neural Chain Monte Carlo that enable for the first time the application of discrete diffusion models to this problem.
arXiv Detail & Related papers (2025-02-12T18:59:55Z)
Heuristically Adaptive Diffusion-Model Evolutionary Strategy [1.8299322342860518]
Diffusion Models represent a significant advancement in generative modeling. Our research reveals a fundamental connection between diffusion models and evolutionary algorithms. Our framework marks a major algorithmic transition, offering increased flexibility, precision, and control in evolutionary optimization processes.
arXiv Detail & Related papers (2024-11-20T16:06:28Z)
Fine-Tuning Discrete Diffusion Models via Reward Optimization with Applications to DNA and Protein Design [56.957070405026194]
We propose an algorithm that enables direct backpropagation of rewards through entire trajectories generated by diffusion models. DRAKES can generate sequences that are both natural-like and yield high rewards.
arXiv Detail & Related papers (2024-10-17T15:10:13Z)
Steering Masked Discrete Diffusion Models via Discrete Denoising Posterior Prediction [88.65168366064061]
We introduce Discrete Denoising Posterior Prediction (DDPP), a novel framework that casts the task of steering pre-trained MDMs as a problem of probabilistic inference. Our framework leads to a family of three novel objectives that are all simulation-free, and thus scalable. We substantiate our designs via wet-lab validation, where we observe transient expression of reward-optimized protein sequences.
arXiv Detail & Related papers (2024-10-10T17:18:30Z)
Variational Search Distributions [16.609027794680213]
We develop VSD, a method for conditioning a generative model of discrete, variational designs on a rare desired class. We empirically demonstrate that VSD can outperform existing baseline methods on a set of real sequence-design problems.
arXiv Detail & Related papers (2024-09-10T01:33:31Z)
Neural Flow Diffusion Models: Learnable Forward Process for Improved Diffusion Modelling [2.1779479916071067]
We introduce a novel framework that enhances diffusion models by supporting a broader range of forward processes. We also propose a novel parameterization technique for learning the forward process. Results underscore NFDM's versatility and its potential for a wide range of applications.
arXiv Detail & Related papers (2024-04-19T15:10:54Z)
Diffusion Model for Data-Driven Black-Box Optimization [54.25693582870226]
We focus on diffusion models, a powerful generative AI technology, and investigate their potential for black-box optimization. We study two practical types of labels: 1) noisy measurements of a real-valued reward function and 2) human preference based on pairwise comparisons. Our proposed method reformulates the design optimization problem into a conditional sampling problem, which allows us to leverage the power of diffusion models.
arXiv Detail & Related papers (2024-03-20T00:41:12Z)
Diffusion Models as Constrained Samplers for Optimization with Unknown Constraints [42.47298301874283]
We propose to perform optimization within the data manifold using diffusion models. Depending on the differentiability of the objective function, we propose two different sampling methods. Our method achieves better or comparable performance with previous state-of-the-art baselines.
arXiv Detail & Related papers (2024-02-28T03:09:12Z)
Conditional Denoising Diffusion for Sequential Recommendation [62.127862728308045]
Two prominent generative models, Generative Adversarial Networks (GANs) and Variational AutoEncoders (VAEs) GANs suffer from unstable optimization, while VAEs are prone to posterior collapse and over-smoothed generations. We present a conditional denoising diffusion model, which includes a sequence encoder, a cross-attentive denoising decoder, and a step-wise diffuser.
arXiv Detail & Related papers (2023-04-22T15:32:59Z)
Reflected Diffusion Models [93.26107023470979]
We present Reflected Diffusion Models, which reverse a reflected differential equation evolving on the support of the data. Our approach learns the score function through a generalized score matching loss and extends key components of standard diffusion models.
arXiv Detail & Related papers (2023-04-10T17:54:38Z)

This list is automatically generated from the titles and abstracts of the papers in this site.