Protein structure generation via folding diffusion
- URL: http://arxiv.org/abs/2209.15611v1
- Date: Fri, 30 Sep 2022 17:35:53 GMT
- Title: Protein structure generation via folding diffusion
- Authors: Kevin E. Wu, Kevin K. Yang, Rianne van den Berg, James Y. Zou, Alex X. Lu, Ava P. Amini
- Abstract summary: We present a new diffusion-based generative model that designs protein backbone structures.
We generate new structures by denoising from a random, unfolded state towards a stable folded structure.
As a useful resource, we release the first open-source codebase and trained models for protein structure diffusion.
- Score: 16.12124223972183
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: The ability to computationally generate novel yet physically foldable protein
structures could lead to new biological discoveries and new treatments
targeting yet incurable diseases. Despite recent advances in protein structure
prediction, directly generating diverse, novel protein structures from neural
networks remains difficult. In this work, we present a new diffusion-based
generative model that designs protein backbone structures via a procedure that
mirrors the native folding process. We describe protein backbone structure as a
series of consecutive angles capturing the relative orientation of the
constituent amino acid residues, and generate new structures by denoising from
a random, unfolded state towards a stable folded structure. Not only does this
mirror how proteins biologically twist into energetically favorable
conformations, the inherent shift and rotational invariance of this
representation crucially alleviates the need for complex equivariant networks.
We train a denoising diffusion probabilistic model with a simple transformer
backbone and demonstrate that our resulting model unconditionally generates
highly realistic protein structures with complexity and structural patterns
akin to those of naturally-occurring proteins. As a useful resource, we release
the first open-source codebase and trained models for protein structure
diffusion.
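The two ideas in the abstract, an angle-based backbone description that is unchanged by shifts and rotations, and noising that description toward a random state, can be illustrated with a minimal NumPy sketch. This is not the authors' released code. The function names (`wrap`, `dihedral`, `forward_noise`), the choice to wrap noised angles into [-pi, pi), and the use of a single cumulative noise level `alpha_bar_t` are assumptions made for illustration, following the standard denoising diffusion forward process.

```python
import numpy as np

def wrap(angles):
    """Wrap angles (radians) into the interval [-pi, pi)."""
    return (angles + np.pi) % (2 * np.pi) - np.pi

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (radians) defined by four consecutive points."""
    b0 = p0 - p1
    b1 = p2 - p1
    b2 = p3 - p2
    b1 = b1 / np.linalg.norm(b1)
    # Project b0 and b2 onto the plane perpendicular to the central bond b1.
    v = b0 - np.dot(b0, b1) * b1
    w = b2 - np.dot(b2, b1) * b1
    x = np.dot(v, w)
    y = np.dot(np.cross(b1, v), w)
    return np.arctan2(y, x)

def forward_noise(x0, alpha_bar_t, rng):
    """Sample x_t from a DDPM-style forward process q(x_t | x_0) on angles.

    Assumption: the noised values are wrapped back into [-pi, pi) so that
    they remain valid angles. Returns the noised angles and the noise used.
    """
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bar_t) * x0 + np.sqrt(1.0 - alpha_bar_t) * eps
    return wrap(xt), eps

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pts = rng.standard_normal((4, 3))  # four hypothetical backbone atoms

    # Build a random proper rotation (determinant +1) and a random shift.
    q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]
    moved = pts @ q.T + rng.standard_normal(3)

    # The dihedral angle is the same before and after the rigid motion.
    print(np.isclose(dihedral(*pts), dihedral(*moved)))

    # Noising a toy angle vector at a fairly high noise level.
    angles = np.array([-1.0, 0.5, 2.0])
    noised, _ = forward_noise(angles, alpha_bar_t=0.1, rng=rng)
    print(noised)
```

Because the angles do not change under rigid motions, a model operating on them does not need to be equivariant by construction, which is the point the abstract makes about avoiding complex equivariant networks.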
Related papers
- SaDiT: Efficient Protein Backbone Design via Latent Structural Tokenization and Diffusion Transformers [50.18388227899971]
We present SaDiT, a novel framework that accelerates protein backbone generation by integrating SaProt Tokenization with a Diffusion Transformer (DiT) architecture.
Experiments demonstrate that SaDiT outperforms state-of-the-art models, including RFDiffusion and Proteina, in both computational speed and structural viability.
arXiv Detail & Related papers (2026-02-06T13:50:13Z) - Protein Autoregressive Modeling via Multiscale Structure Generation [51.92004892768298]
We present protein autoregressive modeling (PAR), the first multi-scale autoregressive framework for protein backbone generation.
We adopt noisy context learning and scheduled sampling, enabling robust backbone generation.
On the unconditional generation benchmark, PAR effectively learns protein distributions and produces backbones of high design quality.
arXiv Detail & Related papers (2026-02-04T18:59:49Z) - ProteinAE: Protein Diffusion Autoencoders for Structure Encoding [64.77182442408254]
We introduce ProteinAE, a novel and streamlined protein diffusion autoencoder.
ProteinAE directly maps protein backbone coordinates from E(3) into a continuous, compact latent space.
We demonstrate that ProteinAE achieves state-of-the-art reconstruction quality, outperforming existing autoencoders.
arXiv Detail & Related papers (2025-10-12T14:30:32Z) - Let Physics Guide Your Protein Flows: Topology-aware Unfolding and Generation [42.116704617358636]
Diffusion-based generative models have revolutionized protein design, enabling the creation of novel proteins.
We introduce a physically motivated non-linear noising process, grounded in classical physics, that unfolds proteins into secondary structures.
We then integrate this process with the flow-matching paradigm on SE(3) to model the invariant distribution of protein backbones with high fidelity.
arXiv Detail & Related papers (2025-09-29T18:31:22Z) - Mask prior-guided denoising diffusion improves inverse protein folding [3.1373465343833704]
Inverse protein folding generates valid amino acid sequences that can fold into a desired protein structure.
To tackle such low-confidence residue prediction, we propose a Mask-prior-guided denoising Diffusion framework.
MapDiff is a discrete diffusion probabilistic model that iteratively generates amino acid sequences with reduced noise.
arXiv Detail & Related papers (2024-12-10T09:10:28Z) - 4D Diffusion for Dynamic Protein Structure Prediction with Reference Guided Motion Alignment [18.90451943620277]
This study introduces an innovative 4D diffusion model incorporating molecular dynamics (MD) simulation data to learn dynamic protein structures.
To our knowledge, this is the first diffusion-based model aimed at predicting protein trajectories across multiple time steps simultaneously.
arXiv Detail & Related papers (2024-08-22T14:12:50Z) - Protein Conformation Generation via Force-Guided SE(3) Diffusion Models [48.48934625235448]
Deep generative modeling techniques have been employed to generate novel protein conformations.
We propose a force-guided SE(3) diffusion model, ConfDiff, for protein conformation generation.
arXiv Detail & Related papers (2024-03-21T02:44:08Z) - A Protein Structure Prediction Approach Leveraging Transformer and CNN Integration [4.909112037834705]
This paper adopts a two-dimensional fusion deep neural network model, DstruCCN, which uses Convolutional Neural Networks (CCN) and a supervised Transformer protein language model for single-sequence protein structure prediction.
The training features of the two are combined to predict the protein Transformer binding site matrix, and then the three-dimensional structure is reconstructed using energy minimization.
arXiv Detail & Related papers (2024-02-29T12:24:20Z) - A Latent Diffusion Model for Protein Structure Generation [50.74232632854264]
We propose a latent diffusion model that can reduce the complexity of protein modeling.
We show that our method can effectively generate novel protein backbone structures with high designability and efficiency.
arXiv Detail & Related papers (2023-05-06T19:10:19Z) - Generating Novel, Designable, and Diverse Protein Structures by Equivariantly Diffusing Oriented Residue Clouds [0.0]
Structure-based protein design aims to find structures that are designable, novel, and diverse.
Generative models provide a compelling alternative, by implicitly learning the low-dimensional structure of complex data.
We develop Genie, a generative model of protein structures that performs discrete-time diffusion using a cloud of oriented reference frames in 3D space.
arXiv Detail & Related papers (2023-01-29T16:44:19Z) - Protein Sequence and Structure Co-Design with Equivariant Translation [19.816174223173494]
Existing approaches generate both protein sequence and structure using either autoregressive models or diffusion models.
We propose a new approach capable of protein sequence and structure co-design, which iteratively translates both protein sequence and structure into the desired state.
Our model consists of a trigonometry-aware encoder that reasons about geometrical constraints and interactions from context features.
All protein amino acids are updated in one shot in each translation step, which significantly accelerates the inference process.
arXiv Detail & Related papers (2022-10-17T06:00:12Z) - State-specific protein-ligand complex structure prediction with a multi-scale deep generative model [68.28309982199902]
We present NeuralPLexer, a computational approach that can directly predict protein-ligand complex structures.
Our study suggests that a data-driven approach can capture the structural cooperativity between proteins and small molecules, showing promise in accelerating the design of enzymes, drug molecules, and beyond.
arXiv Detail & Related papers (2022-09-30T01:46:38Z) - Learning Geometrically Disentangled Representations of Protein Folding Simulations [72.03095377508856]
This work focuses on learning a generative neural network on a structural ensemble of a drug-target protein.
Model tasks involve characterizing the distinct structural fluctuations of the protein bound to various drug molecules.
Results show that our geometric learning-based method enjoys both accuracy and efficiency for generating complex structural variations.
arXiv Detail & Related papers (2022-05-20T19:38:00Z) - EBM-Fold: Fully-Differentiable Protein Folding Powered by Energy-based Models [53.17320541056843]
We propose a fully-differentiable approach for protein structure optimization, guided by a data-driven generative network.
Our EBM-Fold approach can efficiently produce high-quality decoys, compared against traditional Rosetta-based structure optimization routines.
arXiv Detail & Related papers (2021-05-11T03:40:29Z) - Transfer Learning for Protein Structure Classification at Low Resolution [124.5573289131546]
We show that it is possible to make accurate ($\geq$80%) predictions of protein class and architecture from structures determined at low ($\leq$3Å) resolution.
We provide proof of concept for high-speed, low-cost protein structure classification at low resolution, and a basis for extension to prediction of function.
arXiv Detail & Related papers (2020-08-11T15:01:32Z)