Chemically Transferable Generative Backmapping of Coarse-Grained
Proteins
- URL: http://arxiv.org/abs/2303.01569v1
- Date: Thu, 2 Mar 2023 20:51:57 GMT
- Authors: Soojung Yang and Rafael Gómez-Bombarelli
- Abstract summary: Coarse-graining (CG) accelerates simulations of protein dynamics by simulating sets of atoms as singular beads.
Backmapping is the inverse operation: recovering the lost atomistic detail from the CG representation.
This work builds a fast, transferable, and reliable generative backmapping tool for CG protein representations.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Coarse-graining (CG) accelerates molecular simulations of protein dynamics by
simulating sets of atoms as singular beads. Backmapping is the inverse
operation: recovering the lost atomistic detail from the CG representation.
While machine learning (ML) has produced accurate and efficient CG simulations
of proteins, fast and reliable backmapping remains a challenge. Rule-based
methods produce poor all-atom geometries, needing computationally costly
refinement through additional simulations. Recently proposed ML approaches
outperform traditional baselines but are not transferable between proteins and
sometimes generate unphysical atom placements with steric clashes and
implausible torsion angles. This work addresses both issues to build a fast,
transferable, and reliable generative backmapping tool for CG protein
representations. We achieve generalization and reliability through a combined
set of innovations: representation based on internal coordinates; an
equivariant encoder/prior; a custom loss function that helps ensure local
structure, global structure, and physical constraints; and expert curation of
high-quality out-of-equilibrium protein data for training. Our results pave the
way for out-of-the-box backmapping of coarse-grained simulations for arbitrary
proteins.
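The internal-coordinate representation named among the innovations can be illustrated with a minimal sketch (this is an illustrative helper, not the paper's code): computing a torsion angle from four atom positions, the kind of local quantity an internal-coordinate backmapping model predicts instead of raw Cartesian positions.

```python
import numpy as np

def dihedral(p0, p1, p2, p3):
    """Torsion (dihedral) angle in radians defined by four atom positions.

    Internal-coordinate representations describe a structure through bond
    lengths, bond angles, and torsions like this one, rather than raw
    Cartesian coordinates (IUPAC convention: cis = 0, trans = pi).
    """
    b1, b2, b3 = p1 - p0, p2 - p1, p3 - p2
    n1 = np.cross(b1, b2)                 # normal of the plane (p0, p1, p2)
    n2 = np.cross(b2, b3)                 # normal of the plane (p1, p2, p3)
    m1 = np.cross(n1, b2 / np.linalg.norm(b2))
    return np.arctan2(np.dot(m1, n2), np.dot(n1, n2))

# Planar cis and trans arrangements as a sanity check.
cis = dihedral(np.array([0., 1, 0]), np.array([0., 0, 0]),
               np.array([1., 0, 0]), np.array([1., 1, 0]))
trans = dihedral(np.array([0., 1, 0]), np.array([0., 0, 0]),
                 np.array([1., 0, 0]), np.array([1., -1, 0]))
print(round(cis, 6), round(abs(trans), 6))   # 0.0 and pi (≈ 3.141593)
```

Predicting torsions and local bond geometry rather than absolute positions makes it easier to enforce physically valid local structure across different proteins, which is plausibly one reason the abstract pairs internal coordinates with transferability.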
Related papers
- Long-context Protein Language Model [76.95505296417866]
Self-supervised training of language models (LMs) on protein sequences has proven successful for learning meaningful representations and for generative drug design.
Most protein LMs are based on the Transformer architecture trained on individual proteins with short context lengths.
We propose LC-PLM, based on an alternative protein LM architecture, BiMamba-S, built on selective structured state-space models.
We also introduce its graph-contextual variant, LC-PLM-G, which contextualizes protein-protein interaction graphs for a second stage of training.
arXiv Detail & Related papers (2024-10-29T16:43:28Z)
- The Latent Road to Atoms: Backmapping Coarse-grained Protein Structures with Latent Diffusion [19.85659309869674]
Latent Diffusion Backmapping (LDB) is a novel approach that leverages denoising diffusion in latent space to address the challenges of backmapping.
We evaluate LDB's state-of-the-art performance on three distinct protein datasets.
Our results position LDB as a powerful and scalable approach for backmapping, effectively bridging the gap between CG simulations and atomic-level analyses in computational biology.
arXiv Detail & Related papers (2024-10-17T06:38:07Z)
- Protein Conformation Generation via Force-Guided SE(3) Diffusion Models [48.48934625235448]
Deep generative modeling techniques have been employed to generate novel protein conformations.
We propose a force-guided SE(3) diffusion model, ConfDiff, for protein conformation generation.
arXiv Detail & Related papers (2024-03-21T02:44:08Z)
- Navigating protein landscapes with a machine-learned transferable coarse-grained model [29.252004942896875]
Building a coarse-grained (CG) model with comparable prediction performance has been a long-standing challenge.
We develop a bottom-up CG force field with chemical transferability, which can be used for extrapolative molecular dynamics on new sequences.
We demonstrate that the model successfully predicts folded structures, intermediates, metastable folded and unfolded basins, and the fluctuations of intrinsically disordered proteins.
arXiv Detail & Related papers (2023-10-27T17:10:23Z)
- DiAMoNDBack: Diffusion-denoising Autoregressive Model for Non-Deterministic Backmapping of Cα Protein Traces [0.0]
DiAMoNDBack is an autoregressive denoising diffusion probabilistic model for non-deterministic backmapping.
We train DiAMoNDBack on 65k+ structures from the Protein Data Bank (PDB) and validate it on a held-out PDB test set.
We make DiAMoNDBack publicly available as a free and open source Python package.
arXiv Detail & Related papers (2023-07-23T23:05:08Z)
- Top-down machine learning of coarse-grained protein force-fields [2.1485350418225244]
Our methodology involves simulating proteins with molecular dynamics and utilizing the resulting trajectories to train a neural network potential.
Remarkably, this method requires only the native conformation of proteins, eliminating the need for labeled data.
By applying Markov State Models, native-like conformations of the simulated proteins can be predicted from the coarse-grained simulations.
arXiv Detail & Related papers (2023-06-20T08:31:24Z)
- State-specific protein-ligand complex structure prediction with a multi-scale deep generative model [68.28309982199902]
We present NeuralPLexer, a computational approach that can directly predict protein-ligand complex structures.
Our study suggests that a data-driven approach can capture the structural cooperativity between proteins and small molecules, showing promise in accelerating the design of enzymes, drug molecules, and beyond.
arXiv Detail & Related papers (2022-09-30T01:46:38Z)
- Learning Geometrically Disentangled Representations of Protein Folding Simulations [72.03095377508856]
This work focuses on learning a generative neural network on a structural ensemble of a drug-target protein.
Model tasks involve characterizing the distinct structural fluctuations of the protein bound to various drug molecules.
Results show that our geometric learning-based method enjoys both accuracy and efficiency for generating complex structural variations.
arXiv Detail & Related papers (2022-05-20T19:38:00Z)
- Accurate Machine Learned Quantum-Mechanical Force Fields for Biomolecular Simulations [51.68332623405432]
Molecular dynamics (MD) simulations allow atomistic insights into chemical and biological processes.
Recently, machine learned force fields (MLFFs) emerged as an alternative means to execute MD simulations.
This work proposes a general approach to constructing accurate MLFFs for large-scale molecular simulations.
arXiv Detail & Related papers (2022-05-17T13:08:28Z)
- EBM-Fold: Fully-Differentiable Protein Folding Powered by Energy-based Models [53.17320541056843]
We propose a fully-differentiable approach for protein structure optimization, guided by a data-driven generative network.
Our EBM-Fold approach can efficiently produce high-quality decoys, compared against traditional Rosetta-based structure optimization routines.
arXiv Detail & Related papers (2021-05-11T03:40:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.