SE(3) diffusion model with application to protein backbone generation
- URL: http://arxiv.org/abs/2302.02277v3
- Date: Mon, 22 May 2023 20:51:32 GMT
- Title: SE(3) diffusion model with application to protein backbone generation
- Authors: Jason Yim, Brian L. Trippe, Valentin De Bortoli, Emile Mathieu, Arnaud Doucet, Regina Barzilay, Tommi Jaakkola
- Abstract summary: We develop theoretical foundations of SE(3) invariant diffusion models on multiple frames followed by a novel framework, FrameDiff, for learning the SE(3) equivariant score over multiple frames.
We find our samples are capable of generalizing beyond any known protein structure.
- Score: 44.49148900897113
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The design of novel protein structures remains a challenge in protein
engineering for applications across biomedicine and chemistry. In this line of
work, a diffusion model over rigid bodies in 3D (referred to as frames) has
shown success in generating novel, functional protein backbones that have not
been observed in nature. However, there exists no principled methodological
framework for diffusion on SE(3), the space of orientation-preserving rigid
motions in R^3, that operates on frames and confers the group invariance. We
address these shortcomings by developing theoretical foundations of SE(3)
invariant diffusion models on multiple frames followed by a novel framework,
FrameDiff, for learning the SE(3) equivariant score over multiple frames. We
apply FrameDiff on monomer backbone generation and find it can generate
designable monomers up to 500 amino acids without relying on a pretrained
protein structure prediction network that has been integral to previous
methods. We find our samples are capable of generalizing beyond any known
protein structure.
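
To make the notion of a frame concrete, the following is a minimal sketch, not the FrameDiff implementation. It assumes a frame is a pair (R, t) of a 3x3 rotation matrix and a translation vector acting on a point as x -> R x + t. All function names (`rot_z`, `apply_frame`, `compose`) are hypothetical and chosen only for illustration. It shows how rigid motions compose, and that applying one global rigid motion to every frame leaves the distance between frame origins unchanged, which is the kind of symmetry an SE(3)-invariant model is meant to respect.

```python
import math

def rot_z(theta):
    """Rotation by theta about the z-axis (one simple element of SO(3))."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matvec(R, x):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(R[i][k] * x[k] for k in range(3)) for i in range(3)]

def matmul(A, B):
    """Multiply two 3x3 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def apply_frame(frame, x):
    """Apply the rigid motion (R, t) to a point: x -> R x + t."""
    R, t = frame
    Rx = matvec(R, x)
    return [Rx[i] + t[i] for i in range(3)]

def compose(g, f):
    """Rigid motion 'g after f': (Rg, tg)(Rf, tf) = (Rg Rf, Rg tf + tg)."""
    Rg, tg = g
    Rf, tf = f
    Rg_tf = matvec(Rg, tf)
    return (matmul(Rg, Rf), [Rg_tf[i] + tg[i] for i in range(3)])

# Two illustrative frames (think: two residues) and one global rigid motion.
frames = [(rot_z(0.3), [1.0, 0.0, 0.0]), (rot_z(-1.1), [0.0, 2.0, 0.5])]
g = (rot_z(0.7), [3.0, -1.0, 2.0])
moved = [compose(g, f) for f in frames]

# Distance between frame origins before and after the global motion.
d_before = math.dist(frames[0][1], frames[1][1])
d_after = math.dist(moved[0][1], moved[1][1])

# Composition agrees with applying the two motions one after the other.
x = [0.5, -0.25, 1.0]
direct = apply_frame(compose(g, frames[0]), x)
stepwise = apply_frame(g, apply_frame(frames[0], x))
```

Here `d_before` and `d_after` agree up to floating-point error, as do `direct` and `stepwise`. Rotations about a single axis are used only to keep the sketch short; a general frame would carry an arbitrary rotation matrix.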
Related papers
- Geometric Self-Supervised Pretraining on 3D Protein Structures using Subgraphs [25.93347924265175]
We propose a novel self-supervised method to pretrain 3D graph neural networks on 3D protein structures.
By considering subgraphs and their relationships to the global protein structure, the model can learn to reason about these hierarchical levels of organization.
arXiv Detail & Related papers (2024-06-20T09:34:31Z)
- Protein Conformation Generation via Force-Guided SE(3) Diffusion Models [48.48934625235448]
Deep generative modeling techniques have been employed to generate novel protein conformations.
We propose a force-guided SE(3) diffusion model, ConfDiff, for protein conformation generation.
arXiv Detail & Related papers (2024-03-21T02:44:08Z)
- Diffusion Language Models Are Versatile Protein Learners [80.51049288791717]
The diffusion protein language model (DPLM) is a versatile protein language model that demonstrates strong generative and predictive capabilities for protein sequences.
We first pre-train scalable DPLMs from evolutionary-scale protein sequences within a generative self-supervised discrete diffusion probabilistic framework.
After pre-training, DPLM exhibits the ability to generate structurally plausible, novel, and diverse protein sequences for unconditional generation.
arXiv Detail & Related papers (2024-02-28T18:57:56Z)
- A Latent Diffusion Model for Protein Structure Generation [50.74232632854264]
We propose a latent diffusion model that can reduce the complexity of protein modeling.
We show that our method can effectively generate novel protein backbone structures with high designability and efficiency.
arXiv Detail & Related papers (2023-05-06T19:10:19Z)
- Generating Novel, Designable, and Diverse Protein Structures by Equivariantly Diffusing Oriented Residue Clouds [0.0]
Structure-based protein design aims to find structures that are designable, novel, and diverse.
Generative models provide a compelling alternative, by implicitly learning the low-dimensional structure of complex data.
We develop Genie, a generative model of protein structures that performs discrete-time diffusion using a cloud of oriented reference frames in 3D space.
arXiv Detail & Related papers (2023-01-29T16:44:19Z)
- Protein structure generation via folding diffusion [16.12124223972183]
We present a new diffusion-based generative model that designs protein backbone structures.
We generate new structures by denoising from a random, unfolded state towards a stable folded structure.
As a useful resource, we release the first open-source and trained models for protein structure diffusion.
arXiv Detail & Related papers (2022-09-30T17:35:53Z)
- Independent SE(3)-Equivariant Models for End-to-End Rigid Protein Docking [57.2037357017652]
We tackle rigid body protein-protein docking, i.e., computationally predicting the 3D structure of a protein-protein complex from the individual unbound structures.
We design a novel pairwise-independent SE(3)-equivariant graph matching network to predict the rotation and translation to place one of the proteins at the right docked position.
Our model, named EquiDock, approximates the binding pockets and predicts the docking poses using keypoint matching and alignment.
arXiv Detail & Related papers (2021-11-15T18:46:37Z)
- G-VAE, a Geometric Convolutional VAE for ProteinStructure Generation [41.66010308405784]
We introduce a joint geometric-neural networks approach for comparing, deforming and generating 3D protein structures.
Our method is able to generate plausible structures, different from the structures in the training data.
arXiv Detail & Related papers (2021-06-22T16:52:48Z)
- Functional Protein Structure Annotation Using a Deep Convolutional Generative Adversarial Network [4.3871352596331255]
We introduce the use of a Deep Convolutional Generative Adversarial Network (DCGAN) to classify protein structures based on their functionality.
We train DCGAN on 3-dimensional (3D) decoy and native protein structures in order to generate and discriminate 3D protein structures.
arXiv Detail & Related papers (2021-04-18T22:18:52Z)
- BERTology Meets Biology: Interpreting Attention in Protein Language Models [124.8966298974842]
We demonstrate methods for analyzing protein Transformer models through the lens of attention.
We show that attention captures the folding structure of proteins, connecting amino acids that are far apart in the underlying sequence, but spatially close in the three-dimensional structure.
We also present a three-dimensional visualization of the interaction between attention and protein structure.
arXiv Detail & Related papers (2020-06-26T21:50:17Z)