Inverse Protein Folding Using Deep Bayesian Optimization
- URL: http://arxiv.org/abs/2305.18089v1
- Date: Thu, 25 May 2023 02:15:25 GMT
- Title: Inverse Protein Folding Using Deep Bayesian Optimization
- Authors: Natalie Maus and Yimeng Zeng and Daniel Allen Anderson and Phillip
Maffettone and Aaron Solomon and Peyton Greenside and Osbert Bastani and
Jacob R. Gardner
- Abstract summary: Inverse protein folding has surfaced as an important problem in the "top down", de novo design of proteins.
In this paper, we cast the problem of improving generated inverse folds as an optimization problem that we solve using recent advances in "deep" or "latent space" Bayesian optimization.
Our approach consistently produces protein sequences with greatly reduced structural error to the target backbone structure as measured by TM score and RMSD.
- Score: 18.77797005929986
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Inverse protein folding -- the task of predicting a protein sequence from its
backbone atom coordinates -- has surfaced as an important problem in the "top
down", de novo design of proteins. Contemporary approaches have cast this
problem as a conditional generative modelling problem, where a large generative
model over protein sequences is conditioned on the backbone. While these
generative models very rapidly produce promising sequences, independent draws
from generative models may fail to produce sequences that reliably fold to the
correct backbone. Furthermore, it is challenging to adapt pure generative
approaches to other settings, e.g., when constraints exist. In this paper, we
cast the problem of improving generated inverse folds as an optimization
problem that we solve using recent advances in "deep" or "latent space"
Bayesian optimization. Our approach consistently produces protein sequences
with greatly reduced structural error to the target backbone structure as
measured by TM score and RMSD while using fewer computational resources.
Additionally, we demonstrate other advantages of an optimization-based approach
to the problem, such as the ability to handle constraints.
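To make the "latent space" Bayesian optimization idea concrete, the sketch below shows a generic loop of that kind: a Gaussian-process surrogate fit over latent points, an expected-improvement acquisition, and a black-box objective standing in for decode-fold-and-score. This is an illustrative sketch, not the paper's implementation; the `decode_and_score` stand-in, latent dimensionality, and evaluation budget are assumptions made only for this example.

```python
# Minimal latent-space Bayesian optimization loop (illustrative sketch).
# In the real setting, a latent point z would be decoded by a generative
# inverse-folding model into a sequence, folded, and scored by RMSD / TM
# score against the target backbone; here a toy objective stands in.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

rng = np.random.default_rng(0)
LATENT_DIM = 8  # assumed latent dimensionality for this toy example


def decode_and_score(z: np.ndarray) -> float:
    """Toy stand-in for: decode latent z -> sequence -> fold -> structural error."""
    # Smooth synthetic objective; lower is better (like an RMSD in Angstroms).
    return float(np.sum((z - 0.5) ** 2))


def expected_improvement(mu, sigma, best, xi=0.01):
    """Expected improvement for minimization, given the GP posterior mean/std."""
    sigma = np.maximum(sigma, 1e-9)
    imp = best - mu - xi
    zscore = imp / sigma
    return imp * norm.cdf(zscore) + sigma * norm.pdf(zscore)


# Initial design: a handful of random latent points and their scores.
Z = rng.normal(size=(10, LATENT_DIM))
y = np.array([decode_and_score(z) for z in Z])

gp = GaussianProcessRegressor(
    kernel=ConstantKernel(1.0) * RBF(length_scale=1.0),
    normalize_y=True,
)

for step in range(30):
    gp.fit(Z, y)
    # Optimize the acquisition by scoring a batch of random latent candidates.
    candidates = rng.normal(size=(512, LATENT_DIM))
    mu, sigma = gp.predict(candidates, return_std=True)
    ei = expected_improvement(mu, sigma, best=y.min())
    z_next = candidates[np.argmax(ei)]
    # Query the (expensive, in the real setting) objective and update the data.
    y_next = decode_and_score(z_next)
    Z = np.vstack([Z, z_next])
    y = np.append(y, y_next)

print(f"best structural error found: {y.min():.4f}")
```

The key property this loop illustrates is sample efficiency: the surrogate lets the optimizer spend its limited budget of expensive decode-fold-and-score evaluations on the most promising latent points, rather than on independent draws from the generative model.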
Related papers
- Non-negative Weighted DAG Structure Learning [12.139158398361868]
We address the problem of learning the true DAG from nodal observations.
We propose a DAG recovery algorithm that comes with recovery guarantees.
arXiv Detail & Related papers (2024-09-12T09:41:29Z) - Conditional Denoising Diffusion for Sequential Recommendation [62.127862728308045]
Two prominent generative models, Generative Adversarial Networks (GANs) and Variational AutoEncoders (VAEs), have been applied to sequential recommendation.
GANs suffer from unstable optimization, while VAEs are prone to posterior collapse and over-smoothed generations.
We present a conditional denoising diffusion model, which includes a sequence encoder, a cross-attentive denoising decoder, and a step-wise diffuser.
arXiv Detail & Related papers (2023-04-22T15:32:59Z) - Symmetric Tensor Networks for Generative Modeling and Constrained
Combinatorial Optimization [72.41480594026815]
Constrained optimization problems abound in industry, from portfolio optimization to logistics.
One of the major roadblocks in solving these problems is the presence of non-trivial hard constraints which limit the valid search space.
In this work, we encode arbitrary integer-valued equality constraints of the form Ax=b directly into U(1) symmetric tensor networks (TNs) and leverage their applicability as quantum-inspired generative models.
arXiv Detail & Related papers (2022-11-16T18:59:54Z) - Designing Biological Sequences via Meta-Reinforcement Learning and
Bayesian Optimization [68.28697120944116]
We train an autoregressive generative model via Meta-Reinforcement Learning to propose promising sequences for selection.
We pose this problem as that of finding an optimal policy over a distribution of MDPs induced by sampling subsets of the data.
Our in-silico experiments show that meta-learning over such ensembles provides robustness against reward misspecification and achieves competitive results.
arXiv Detail & Related papers (2022-09-13T18:37:27Z) - Combining Genetic Programming and Particle Swarm Optimization to
Simplify Rugged Landscapes Exploration [7.25130576615102]
We propose a novel method for constructing a smooth surrogate model of the original function.
The proposed algorithm, called the GP-FST-PSO Surrogate Model, achieves satisfactory results in both the search for the global optimum and the production of a visual approximation of the original benchmark function.
arXiv Detail & Related papers (2022-06-07T12:55:04Z) - Generative power of a protein language model trained on multiple
sequence alignments [0.5639904484784126]
Computational models starting from large ensembles of evolutionarily related protein sequences capture a representation of protein families.
Protein language models trained on multiple sequence alignments, such as MSA Transformer, are highly attractive candidates to this end.
We propose and test an iterative method that directly uses the masked language modeling objective to generate sequences using MSA Transformer.
arXiv Detail & Related papers (2022-04-14T16:59:05Z) - EBM-Fold: Fully-Differentiable Protein Folding Powered by Energy-based
Models [53.17320541056843]
We propose a fully-differentiable approach for protein structure optimization, guided by a data-driven generative network.
Our EBM-Fold approach can efficiently produce high-quality decoys, compared against traditional Rosetta-based structure optimization routines.
arXiv Detail & Related papers (2021-05-11T03:40:29Z) - Intermediate Layer Optimization for Inverse Problems using Deep
Generative Models [86.29330440222199]
ILO is a novel optimization algorithm for solving inverse problems with deep generative models.
We empirically show that our approach outperforms state-of-the-art methods introduced in StyleGAN-2 and PULSE for a wide range of inverse problems.
arXiv Detail & Related papers (2021-02-15T06:52:22Z) - AdaLead: A simple and robust adaptive greedy search algorithm for
sequence design [55.41644538483948]
We develop an easy-to-direct, scalable, and robust evolutionary greedy algorithm (AdaLead); a generic sketch of this style of greedy sequence search appears after this list.
AdaLead is a remarkably strong benchmark that out-competes more complex state of the art approaches in a variety of biologically motivated sequence design challenges.
arXiv Detail & Related papers (2020-10-05T16:40:38Z)
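As a companion to the AdaLead entry above, here is a minimal, generic greedy evolutionary loop over amino-acid sequences under a black-box fitness oracle. It is not the AdaLead algorithm itself (which adapts its search behaviour to the fitness landscape); the toy `fitness` oracle, alphabet, sequence length, and population settings are illustrative assumptions only.

```python
# Generic greedy evolutionary search over sequences (illustrative only).
# Each generation keeps the top-scoring sequences and fills the population
# with mutated copies of them, under a black-box fitness oracle.
import random

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids
random.seed(0)


def fitness(seq: str) -> float:
    """Toy stand-in for an expensive fitness oracle (e.g., a folding score)."""
    return sum(1.0 for a, b in zip(seq, "ACDEFGHIKL" * 3) if a == b)


def mutate(seq: str, rate: float = 0.05) -> str:
    """Point-mutate each position independently with the given rate."""
    return "".join(
        random.choice(ALPHABET) if random.random() < rate else c for c in seq
    )


# Random initial population of length-30 sequences.
population = ["".join(random.choices(ALPHABET, k=30)) for _ in range(32)]

for generation in range(50):
    scored = sorted(population, key=fitness, reverse=True)
    parents = scored[: len(scored) // 4]           # keep the top quartile
    children = [mutate(p) for p in parents for _ in range(3)]
    population = parents + children                # greedy elitist update

best = max(population, key=fitness)
print(f"best fitness: {fitness(best):.1f}  sequence: {best}")
```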