An Iterative Framework for Generative Backmapping of Coarse Grained Proteins
- URL: http://arxiv.org/abs/2505.18082v1
- Date: Fri, 23 May 2025 16:40:25 GMT
- Title: An Iterative Framework for Generative Backmapping of Coarse Grained Proteins
- Authors: Georgios Kementzidis, Erin Wong, John Nicholson, Ruichen Xu, Yuefan Deng
- Abstract summary: We introduce a novel iterative framework using conditional Variational Autoencoders and graph-based neural networks. We outline the theory of iterative generative backmapping and demonstrate via numerical experiments the advantages of multistep schemes. This multistep approach not only improves the accuracy of reconstructions but also makes the training process more computationally efficient for proteins with ultra-CG representations.
- Score: 0.6990493129893112
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The techniques of data-driven backmapping from coarse-grained (CG) to fine-grained (FG) representation often struggle with accuracy, unstable training, and physical realism, especially when applied to complex systems such as proteins. In this work, we introduce a novel iterative framework using conditional Variational Autoencoders and graph-based neural networks, specifically designed to tackle the challenges associated with such large-scale biomolecules. Our method enables stepwise refinement from CG beads to full atomistic details. We outline the theory of iterative generative backmapping and demonstrate via numerical experiments the advantages of multistep schemes by applying them to proteins of vastly different structures with very coarse representations. This multistep approach not only improves the accuracy of reconstructions but also makes the training process more computationally efficient for proteins with ultra-CG representations.
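A minimal sketch of the kind of multistep scheme the abstract describes: one conditional VAE per resolution level, chained so each stage generates a finer representation conditioned on the coarser one. The plain MLP encoder/decoder, layer sizes, resolution counts, and all names below are illustrative assumptions, not the authors' graph-based architecture.

```python
# Sketch only: iterative CG-to-FG backmapping with one conditional VAE per step.
import torch
import torch.nn as nn

class ConditionalVAEStage(nn.Module):
    """Generates finer-resolution coordinates conditioned on coarser ones."""
    def __init__(self, coarse_dim, fine_dim, latent_dim=32, hidden=128):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(fine_dim + coarse_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * latent_dim),        # mean and log-variance
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim + coarse_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, fine_dim),              # reconstructed fine coordinates
        )
        self.latent_dim = latent_dim

    def forward(self, fine, coarse):
        # Returns reconstruction plus (mu, logvar) for an ELBO-style training loss.
        mu, logvar = self.encoder(torch.cat([fine, coarse], dim=-1)).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)   # reparameterization
        recon = self.decoder(torch.cat([z, coarse], dim=-1))
        return recon, mu, logvar

    def sample(self, coarse):
        z = torch.randn(coarse.shape[0], self.latent_dim)
        return self.decoder(torch.cat([z, coarse], dim=-1))

# Multistep backmapping: chain stages from ultra-CG beads toward atomistic detail.
# Site counts per level (times 3 coordinates) are made-up placeholders.
resolutions = [30, 120, 450]            # e.g. ultra-CG -> intermediate -> fine sites
stages = [ConditionalVAEStage(coarse_dim=3 * c, fine_dim=3 * f)
          for c, f in zip(resolutions[:-1], resolutions[1:])]

coords = torch.randn(8, 3 * resolutions[0])   # batch of ultra-CG configurations
for stage in stages:                           # stepwise refinement, one level at a time
    coords = stage.sample(coords)
print(coords.shape)                            # torch.Size([8, 1350])
```

Training each stage separately on its own pair of resolutions is what makes the multistep scheme cheaper than a single CG-to-all-atom generator for very coarse inputs, which is the efficiency argument the abstract makes.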
Related papers
- Learning the Language of Protein Structure [8.364087723533537]
We introduce an approach using a vector-quantized autoencoder that effectively tokenizes protein structures into discrete representations. To demonstrate the efficacy of our learned representations, we show that a simple GPT model trained on our codebooks can generate novel, diverse, and designable protein structures.
arXiv Detail & Related papers (2024-05-24T16:03:47Z)
- HEroBM: a deep equivariant graph neural network for universal backmapping from coarse-grained to all-atom representations [0.0]
Coarse-grained (CG) techniques have emerged as invaluable tools to sample large-scale systems.
They sacrifice atomistic details that might hold significant relevance in deciphering the investigated process.
A recommended approach is to identify key CG conformations and process them using backmapping methods, which retrieve atomistic coordinates.
We introduce HEroBM, a dynamic and scalable method that employs deep equivariant graph neural networks and a hierarchical approach to achieve high-resolution backmapping.
arXiv Detail & Related papers (2024-04-25T13:54:31Z)
- Self-STORM: Deep Unrolled Self-Supervised Learning for Super-Resolution Microscopy [55.2480439325792]
We introduce deep unrolled self-supervised learning, which alleviates the need for ground-truth training data by training a sequence-specific, model-based autoencoder.
Our proposed method exceeds the performance of its supervised counterparts.
arXiv Detail & Related papers (2024-03-25T17:40:32Z)
- Navigating protein landscapes with a machine-learned transferable coarse-grained model [29.252004942896875]
Developing a coarse-grained (CG) model with comparable prediction performance has been a long-standing challenge.
We develop a bottom-up CG force field with chemical transferability, which can be used for extrapolative molecular dynamics on new sequences.
We demonstrate that the model successfully predicts folded structures, intermediates, metastable folded and unfolded basins, and the fluctuations of intrinsically disordered proteins.
arXiv Detail & Related papers (2023-10-27T17:10:23Z)
- Backdiff: a diffusion model for generalized transferable protein backmapping [9.815461018844523]
BackDiff is a new generative model designed to achieve generalization and reliability in the protein backmapping problem.
Our method facilitates end-to-end training and allows efficient sampling across different proteins and diverse CG models without the need for retraining.
arXiv Detail & Related papers (2023-10-03T03:32:07Z)
- Fast and Functional Structured Data Generators Rooted in Out-of-Equilibrium Physics [44.97217246897902]
We address the challenge of using energy-based models to produce high-quality, label-specific data in structured datasets.
Traditional training methods encounter difficulties due to inefficient Markov chain Monte Carlo mixing.
We use a novel training algorithm that exploits non-equilibrium effects.
arXiv Detail & Related papers (2023-07-13T15:08:44Z)
- Chemically Transferable Generative Backmapping of Coarse-Grained Proteins [0.0]
Coarse-graining (CG) accelerates simulations of protein dynamics by simulating sets of atoms as singular beads.
Backmapping is the opposite operation of bringing lost atomistic details back from the CG representation.
This work builds a fast, transferable, and reliable generative backmapping tool for CG protein representations.
arXiv Detail & Related papers (2023-03-02T20:51:57Z)
- Learning Geometrically Disentangled Representations of Protein Folding Simulations [72.03095377508856]
This work focuses on learning a generative neural network on a structural ensemble of a drug-target protein.
The model's tasks include characterizing the distinct structural fluctuations of the protein bound to various drug molecules.
Results show that our geometric learning-based method enjoys both accuracy and efficiency for generating complex structural variations.
arXiv Detail & Related papers (2022-05-20T19:38:00Z)
- Inducing Gaussian Process Networks [80.40892394020797]
We propose inducing Gaussian process networks (IGN), a simple framework for simultaneously learning the feature space as well as the inducing points.
The inducing points, in particular, are learned directly in the feature space, enabling a seamless representation of complex structured domains.
We report on experimental results for real-world data sets showing that IGNs provide significant advances over state-of-the-art methods.
arXiv Detail & Related papers (2022-04-21T05:27:09Z)
- Fractal Structure and Generalization Properties of Stochastic Optimization Algorithms [71.62575565990502]
We prove that the generalization error of an optimization algorithm can be bounded in terms of the complexity of the fractal structure that underlies its generalization measure.
We further specialize our results to specific problems (e.g., linear/logistic regression, one-hidden-layer neural networks) and algorithms.
arXiv Detail & Related papers (2021-06-09T08:05:36Z)
- A parameter refinement method for Ptychography based on Deep Learning concepts [55.41644538483948]
Coarse parametrisation of the propagation distance, position errors, and partial coherence frequently threatens the viability of the experiment.
A modern Deep Learning framework is used to autonomously correct these setup incoherences, thus improving the quality of the ptychography reconstruction.
We tested our system on both synthetic datasets and real data acquired at the TwinMic beamline of the Elettra synchrotron facility.
arXiv Detail & Related papers (2021-05-18T10:15:17Z)
- EBM-Fold: Fully-Differentiable Protein Folding Powered by Energy-based Models [53.17320541056843]
We propose a fully-differentiable approach for protein structure optimization, guided by a data-driven generative network.
Our EBM-Fold approach can efficiently produce high-quality decoys, compared against traditional Rosetta-based structure optimization routines.
arXiv Detail & Related papers (2021-05-11T03:40:29Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences.