Related papers: Ophiuchus: Scalable Modeling of Protein Structures through Hierarchical Coarse-graining SO(3)-Equivariant Autoencoders

URL: http://arxiv.org/abs/2310.02508v2
Date: Wed, 27 Dec 2023 00:52:52 GMT
Title: Ophiuchus: Scalable Modeling of Protein Structures through Hierarchical Coarse-graining SO(3)-Equivariant Autoencoders
Authors: Allan dos Santos Costa and Ilan Mitnikov and Mario Geiger and Manvitha Ponnapati and Tess Smidt and Joseph Jacobson
Abstract summary: Three-dimensional native states of natural proteins display recurring and hierarchical patterns. Traditional graph-based modeling of protein structures is often limited to operate within a single fine-grained resolution. We introduce Ophiuchus, an SO(3)-equivariant coarse-graining model that efficiently operates on all-atom protein structures.
Score: 1.8835495377767553
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Three-dimensional native states of natural proteins display recurring and hierarchical patterns. Yet, traditional graph-based modeling of protein structures is often limited to operate within a single fine-grained resolution, and lacks hourglass neural architectures to learn those high-level building blocks. We narrow this gap by introducing Ophiuchus, an SO(3)-equivariant coarse-graining model that efficiently operates on all-atom protein structures. Our model departs from current approaches that employ graph modeling, instead focusing on local convolutional coarsening to model sequence-motif interactions with efficient time complexity in protein length. We measure the reconstruction capabilities of Ophiuchus across different compression rates, and compare it to existing models. We examine the learned latent space and demonstrate its utility through conformational interpolation. Finally, we leverage denoising diffusion probabilistic models (DDPM) in the latent space to efficiently sample protein structures. Our experiments demonstrate Ophiuchus to be a scalable basis for efficient protein modeling and generation.

Related papers

Simultaneous Modeling of Protein Conformation and Dynamics via Autoregression [45.49904590474368]
ConfRover is an autoregressive model that simultaneously learns protein conformation and dynamics from MD trajectories. It supports both time-dependent and time-independent sampling. Experiments on ATLAS, a large-scale protein MD dataset, demonstrate the effectiveness of our model.
arXiv Detail & Related papers (2025-05-23T05:00:15Z)
Learning conformational ensembles of proteins based on backbone geometry [1.1874952582465603]
We propose a flow matching model for sampling protein conformations based solely on backbone geometry. The resulting model is orders of magnitude faster than current state-of-the-art approaches at comparable accuracy and can be trained from scratch in a few GPU days.
arXiv Detail & Related papers (2025-02-19T17:16:27Z)
Structure Language Models for Protein Conformation Generation [66.42864253026053]
Traditional physics-based simulation methods often struggle with sampling equilibrium conformations. Deep generative models have shown promise in generating protein conformations as a more efficient alternative. We introduce Structure Language Modeling as a novel framework for efficient protein conformation generation.
arXiv Detail & Related papers (2024-10-24T03:38:51Z)
DPLM-2: A Multimodal Diffusion Protein Language Model [75.98083311705182]
We introduce DPLM-2, a multimodal protein foundation model that extends the discrete diffusion protein language model (DPLM) to accommodate both sequences and structures. DPLM-2 learns the joint distribution of sequence and structure, as well as their marginals and conditionals. Empirical evaluation shows that DPLM-2 can simultaneously generate highly compatible amino acid sequences and their corresponding 3D structures.
arXiv Detail & Related papers (2024-10-17T17:20:24Z)
Fine-Tuning Discrete Diffusion Models via Reward Optimization with Applications to DNA and Protein Design [56.957070405026194]
We propose an algorithm, DRAKES, that enables direct backpropagation of rewards through entire trajectories generated by diffusion models. DRAKES can generate sequences that are both natural-like and yield high rewards.
arXiv Detail & Related papers (2024-10-17T15:10:13Z)
Accelerating Inference in Molecular Diffusion Models with Latent Representations of Protein Structure [0.0]
Diffusion generative models operate directly on 3D molecular structures. We present a novel GNN-based architecture for learning latent representations of molecular structure. Our model achieves comparable performance to one with an all-atom protein representation while exhibiting a 3-fold reduction in inference time.
arXiv Detail & Related papers (2023-11-22T15:32:31Z)
Geometric Neural Diffusion Processes [55.891428654434634]
We extend the framework of diffusion models to incorporate a series of geometric priors in infinite-dimension modelling. We show that with these conditions, the generative functional model admits the same symmetry.
arXiv Detail & Related papers (2023-07-11T16:51:38Z)
A Latent Diffusion Model for Protein Structure Generation [50.74232632854264]
We propose a latent diffusion model that can reduce the complexity of protein modeling. We show that our method can effectively generate novel protein backbone structures with high designability and efficiency.
arXiv Detail & Related papers (2023-05-06T19:10:19Z)
EigenFold: Generative Protein Structure Prediction with Diffusion Models [10.24107243529341]
EigenFold is a diffusion generative modeling framework for sampling a distribution of structures from a given protein sequence. On recent CAMEO targets, EigenFold achieves a median TMScore of 0.84, while providing a more comprehensive picture of model uncertainty.
arXiv Detail & Related papers (2023-04-05T02:46:13Z)
Internal-Coordinate Density Modelling of Protein Structure: Covariance Matters [9.49959422062959]
We present a new strategy for modelling protein densities in internal coordinates, which uses constraints in 3D space to induce covariance structure between the internal degrees of freedom. We demonstrate that our approach makes it possible to scale density models of internal coordinates to full protein backbones in two settings.
arXiv Detail & Related papers (2023-02-27T12:18:19Z)
Learning Geometrically Disentangled Representations of Protein Folding Simulations [72.03095377508856]
This work focuses on learning a generative neural network on a structural ensemble of a drug-target protein. Model tasks involve characterizing the distinct structural fluctuations of the protein bound to various drug molecules. Results show that our geometric learning-based method enjoys both accuracy and efficiency for generating complex structural variations.
arXiv Detail & Related papers (2022-05-20T19:38:00Z)
G-VAE, a Geometric Convolutional VAE for Protein Structure Generation [41.66010308405784]
We introduce a joint geometric-neural networks approach for comparing, deforming and generating 3D protein structures. Our method is able to generate plausible structures, different from the structures in the training data.
arXiv Detail & Related papers (2021-06-22T16:52:48Z)

This list is automatically generated from the titles and abstracts of the papers on this site.