Aligning Protein Conformation Ensemble Generation with Physical Feedback
- URL: http://arxiv.org/abs/2505.24203v1
- Date: Fri, 30 May 2025 04:33:39 GMT
- Title: Aligning Protein Conformation Ensemble Generation with Physical Feedback
- Authors: Jiarui Lu, Xiaoyin Chen, Stephen Zhewen Lu, Aurélie Lozano, Vijil Chenthamarakshan, Payel Das, Jian Tang
- Abstract summary: Energy-based Alignment (EBA) is a method that aligns generative models with feedback from physical models. EBA achieves state-of-the-art performance in generating high-quality protein ensembles.
- Score: 29.730515284798397
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Protein dynamics play a crucial role in protein biological functions and properties, and their traditional study typically relies on time-consuming molecular dynamics (MD) simulations conducted in silico. Recent advances in generative modeling, particularly denoising diffusion models, have enabled efficient and accurate protein structure prediction and conformation sampling by learning distributions over crystallographic structures. However, effectively integrating physical supervision into these data-driven approaches remains challenging, as standard energy-based objectives often lead to intractable optimization. In this paper, we introduce Energy-based Alignment (EBA), a method that aligns generative models with feedback from physical models, efficiently calibrating them to appropriately balance conformational states based on their energy differences. Experimental results on the MD ensemble benchmark demonstrate that EBA achieves state-of-the-art performance in generating high-quality protein ensembles. By improving the physical plausibility of generated structures, our approach enhances model predictions and holds promise for applications in structural biology and drug discovery.
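The abstract speaks of balancing conformational states "based on their energy differences". One standard way to make that idea concrete is the Boltzmann distribution, where a state with energy E_i gets weight proportional to exp(-E_i / kT). The sketch below only illustrates that textbook relation; the abstract does not state that EBA uses exactly this form, and the function name, the energy values, and the 300 K temperature are assumptions made for illustration.

```python
import math

# Boltzmann constant in kcal/(mol*K); the 300 K temperature is an assumption.
K_B = 0.0019872041
TEMPERATURE = 300.0


def boltzmann_weights(energies, kT=K_B * TEMPERATURE):
    """Return normalized weights p_i proportional to exp(-E_i / kT)."""
    e_min = min(energies)  # shift by the minimum for numerical stability
    unnorm = [math.exp(-(e - e_min) / kT) for e in energies]
    total = sum(unnorm)
    return [u / total for u in unnorm]


# Hypothetical energies (kcal/mol) for three conformations of one protein.
energies = [-120.0, -119.0, -117.5]
weights = boltzmann_weights(energies)
```

Only energy differences matter here: adding a constant to every energy leaves the weights unchanged, which is why the minimum can be subtracted before exponentiating.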
Related papers
- UniGenX: Unified Generation of Sequence and Structure with Autoregressive Diffusion [61.690978792873196]
Existing approaches rely on either autoregressive sequence models or diffusion models.
We propose UniGenX, a unified framework that combines autoregressive next-token prediction with conditional diffusion models.
We validate the effectiveness of UniGenX on material and small molecule generation tasks.
arXiv Detail & Related papers (2025-03-09T16:43:07Z) - Learning conformational ensembles of proteins based on backbone geometry [1.1874952582465603]
We propose a flow matching model for sampling protein conformations based solely on backbone geometry.
The resulting model is orders of magnitude faster than current state-of-the-art approaches at comparable accuracy and can be trained from scratch in a few GPU days.
arXiv Detail & Related papers (2025-02-19T17:16:27Z) - SFM-Protein: Integrative Co-evolutionary Pre-training for Advanced Protein Sequence Representation [97.99658944212675]
We introduce a novel pre-training strategy for protein foundation models.
It emphasizes the interactions among amino acid residues to enhance the extraction of both short-range and long-range co-evolutionary features.
Trained on a large-scale protein sequence dataset, our model demonstrates superior generalization ability.
arXiv Detail & Related papers (2024-10-31T15:22:03Z) - Structure Language Models for Protein Conformation Generation [66.42864253026053]
Traditional physics-based simulation methods often struggle with sampling equilibrium conformations.
Deep generative models have shown promise in generating protein conformations as a more efficient alternative.
We introduce Structure Language Modeling as a novel framework for efficient protein conformation generation.
arXiv Detail & Related papers (2024-10-24T03:38:51Z) - AlphaFolding: 4D Diffusion for Dynamic Protein Structure Prediction with Reference and Motion Guidance [18.90451943620277]
This study introduces an innovative 4D diffusion model incorporating molecular dynamics (MD) simulation data to learn dynamic protein structures.
Our model exhibits high accuracy in predicting dynamic 3D structures of proteins containing up to 256 amino acids over 32 time steps.
arXiv Detail & Related papers (2024-08-22T14:12:50Z) - Dynamic PDB: A New Dataset and a SE(3) Model Extension by Integrating Dynamic Behaviors and Physical Properties in Protein Structures [15.819618708991598]
We introduce a large-scale dataset, Dynamic PDB, encompassing approximately 12.6K proteins.
We provide a comprehensive suite of physical properties, including atomic velocities and forces, potential and kinetic energies, and the temperature of the simulation environment.
For benchmarking purposes, we evaluate state-of-the-art methods on the proposed dataset for the task of trajectory prediction.
arXiv Detail & Related papers (2024-08-22T14:06:01Z) - Protein Conformation Generation via Force-Guided SE(3) Diffusion Models [48.48934625235448]
Deep generative modeling techniques have been employed to generate novel protein conformations.
We propose a force-guided SE(3) diffusion model, ConfDiff, for protein conformation generation.
arXiv Detail & Related papers (2024-03-21T02:44:08Z) - Physics-informed generative model for drug-like molecule conformers [0.0]
We present a diffusion-based, generative model for conformer generation.
Our model is focused on the reproduction of bonded structure and is constructed from the associated terms traditionally found in classical force fields.
Deep learning is used to infer atom typing and geometric parameters from a training set.
arXiv Detail & Related papers (2024-02-29T17:11:08Z) - Str2Str: A Score-based Framework for Zero-shot Protein Conformation Sampling [23.74897713386661]
The dynamic nature of proteins is crucial for determining their biological functions and properties.
Existing learning-based approaches perform direct sampling yet heavily rely on target-specific simulation data for training.
We propose Str2Str, a novel structure-to-structure translation framework capable of zero-shot conformation sampling.
arXiv Detail & Related papers (2023-06-05T15:19:06Z) - Conditional Generative Models for Simulation of EMG During Naturalistic Movements [45.698312905115955]
We present a conditional generative neural network trained adversarially to generate motor unit activation potential waveforms.
We demonstrate that such a model can predictively interpolate between a much smaller number of the numerical model's outputs with high accuracy.
arXiv Detail & Related papers (2022-11-03T14:49:02Z) - Learning Geometrically Disentangled Representations of Protein Folding Simulations [72.03095377508856]
This work focuses on learning a generative neural network on a structural ensemble of a drug-target protein.
Model tasks involve characterizing the distinct structural fluctuations of the protein bound to various drug molecules.
Results show that our geometric learning-based method enjoys both accuracy and efficiency for generating complex structural variations.
arXiv Detail & Related papers (2022-05-20T19:38:00Z) - Energy-based models for atomic-resolution protein conformations [88.68597850243138]
We propose an energy-based model (EBM) of protein conformations that operates at atomic scale.
The model is trained solely on crystallized protein data.
An investigation of the model's outputs and hidden representations finds that it captures physicochemical properties relevant to protein energy.
arXiv Detail & Related papers (2020-04-27T20:45:12Z)