Unified Biomolecular Trajectory Generation via Pretrained Variational Bridge
- URL: http://arxiv.org/abs/2602.07588v1
- Date: Sat, 07 Feb 2026 15:32:37 GMT
- Title: Unified Biomolecular Trajectory Generation via Pretrained Variational Bridge
- Authors: Ziyang Yu, Wenbing Huang, Yang Liu
- Abstract summary: We present the Pretrained Variational Bridge (PVB), which maps the initial structure into a noised latent space and transports it toward stage-specific targets. This unifies training on both single-structure and paired trajectory data, enabling consistent use of cross-domain structural knowledge. Experiments on proteins and protein-ligand complexes demonstrate that PVB faithfully reproduces thermodynamic and kinetic observables from MD while delivering stable and efficient generative dynamics.
- Score: 19.279397111680115
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Molecular Dynamics (MD) simulations provide a fundamental tool for characterizing molecular behavior at full atomic resolution, but their applicability is severely constrained by the computational cost. To address this, a surge of deep generative models has recently emerged to learn dynamics at coarsened timesteps for efficient trajectory generation, yet they either generalize poorly across systems or, due to limited molecular diversity of trajectory data, fail to fully exploit structural information to improve generative fidelity. Here, we present the Pretrained Variational Bridge (PVB) in an encoder-decoder fashion, which maps the initial structure into a noised latent space and transports it toward stage-specific targets through augmented bridge matching. This unifies training on both single-structure and paired trajectory data, enabling consistent use of cross-domain structural knowledge across training stages. Moreover, for protein-ligand complexes, we further introduce a reinforcement learning-based optimization via adjoint matching that speeds progression toward the holo state, which supports efficient post-optimization of docking poses. Experiments on proteins and protein-ligand complexes demonstrate that PVB faithfully reproduces thermodynamic and kinetic observables from MD while delivering stable and efficient generative dynamics.
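The transport step described in the abstract can be illustrated with a minimal, generic bridge-matching sketch: sample a point on a Brownian bridge pinned between a source and a target structure, then form the regression target a drift network would be trained on. This is a standard bridge-matching objective, not PVB's actual architecture; the linear interpolant, the `sigma` noise scale, and the function names `bridge_sample`/`drift_target` are illustrative assumptions (in PVB the bridge runs in a learned latent space with stage-specific targets).

```python
import numpy as np

def bridge_sample(x0, x1, t, sigma, rng):
    # Brownian bridge pinned at x0 (t=0) and x1 (t=1):
    # the mean interpolates linearly and the noise vanishes at both endpoints.
    mean = (1.0 - t) * x0 + t * x1
    std = sigma * np.sqrt(t * (1.0 - t))
    return mean + std * rng.standard_normal(x0.shape)

def drift_target(xt, x1, t):
    # Regression target for a drift network at time t < 1: the conditional
    # drift of the bridge toward the pinned endpoint x1 given (xt, t).
    return (x1 - xt) / (1.0 - t)

# Illustrative training step: a drift network f(xt, t) would be fit by
# minimizing || f(xt, t) - drift_target(xt, x1, t) ||^2 over sampled (x0, x1, t).
```

At sampling time, the learned drift is integrated forward from a new initial structure with an SDE solver, which is what makes single-step trajectory generation cheap relative to MD.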
Related papers
- Rigidity-Aware Geometric Pretraining for Protein Design and Conformational Ensembles [74.32932832937618]
We introduce RigidSSL (Rigidity-Aware Self-Supervised Learning), a geometric pretraining framework. Phase I (RigidSSL-Perturb) learns geometric priors from 432K structures from the AlphaFold Protein Structure Database with simulated perturbations. Phase II (RigidSSL-MD) refines these representations on 1.3K molecular dynamics trajectories to capture physically realistic transitions.
arXiv Detail & Related papers (2026-03-02T21:32:30Z) - SaDiT: Efficient Protein Backbone Design via Latent Structural Tokenization and Diffusion Transformers [50.18388227899971]
We present SaDiT, a novel framework that accelerates protein backbone generation by integrating SaProt Tokenization with a Diffusion Transformer (DiT) architecture. Experiments demonstrate that SaDiT outperforms state-of-the-art models, including RFDiffusion and Proteina, in both computational speed and structural viability.
arXiv Detail & Related papers (2026-02-06T13:50:13Z) - Scalable Spatio-Temporal SE(3) Diffusion for Long-Horizon Protein Dynamics [51.85385061275941]
Molecular dynamics (MD) simulations remain the gold standard for studying protein dynamics. Recent generative models have shown promise in accelerating simulations, yet they struggle with long-horizon generation. We present STAR-MD, a scalable diffusion model that generates physically plausible protein trajectories over micro-scale timescales.
arXiv Detail & Related papers (2026-02-02T14:13:28Z) - Efficient Regression-Based Training of Normalizing Flows for Boltzmann Generators [85.25962679349551]
Boltzmann Generators (BGs) offer efficient sampling and likelihoods, but their training via maximum likelihood is often unstable and computationally challenging. We propose Regression Training of Normalizing Flows (RegFlow), a novel and scalable regression-based training objective that bypasses the numerical instability and computational challenges of conventional maximum likelihood training.
arXiv Detail & Related papers (2025-06-01T20:32:27Z) - Aligning Protein Conformation Ensemble Generation with Physical Feedback [29.730515284798397]
Energy-based Alignment (EBA) is a method that aligns generative models with feedback from physical models. EBA achieves state-of-the-art performance in generating high-quality protein ensembles.
arXiv Detail & Related papers (2025-05-30T04:33:39Z) - Simultaneous Modeling of Protein Conformation and Dynamics via Autoregression [45.49904590474368]
ConfRover is an autoregressive model that simultaneously learns protein conformation and dynamics from MD trajectories. It supports both time-dependent and time-independent sampling. Experiments on ATLAS, a large-scale protein MD dataset, demonstrate the effectiveness of our model.
arXiv Detail & Related papers (2025-05-23T05:00:15Z) - UniGenX: a unified generative foundation model that couples sequence, structure and function to accelerate scientific design across proteins, molecules and materials [62.72989417755985]
We present UniGenX, a unified generative model for function in natural systems. UniGenX represents heterogeneous inputs as a mixed stream of symbolic and numeric tokens. It achieves state-of-the-art or competitive performance for function-aware generation across domains.
arXiv Detail & Related papers (2025-03-09T16:43:07Z) - Structure Language Models for Protein Conformation Generation [66.42864253026053]
Traditional physics-based simulation methods often struggle with sampling equilibrium conformations. Deep generative models have shown promise in generating protein conformations as a more efficient alternative. We introduce Structure Language Modeling as a novel framework for efficient protein conformation generation.
arXiv Detail & Related papers (2024-10-24T03:38:51Z) - The Latent Road to Atoms: Backmapping Coarse-grained Protein Structures with Latent Diffusion [19.85659309869674]
Latent Diffusion Backmapping (LDB) is a novel approach leveraging denoising diffusion within latent space to address these challenges.
We evaluate LDB's state-of-the-art performance on three distinct protein datasets.
Our results position LDB as a powerful and scalable approach for backmapping, effectively bridging the gap between CG simulations and atomic-level analyses in computational biology.
arXiv Detail & Related papers (2024-10-17T06:38:07Z) - EquiJump: Protein Dynamics Simulation via SO(3)-Equivariant Stochastic Interpolants [13.493198442811865]
We introduce EquiJump, a transferable SO(3)-equivariant model that directly bridges all-atom protein dynamics simulation time steps. Our approach supports diverse sampling and is benchmarked against existing models on trajectory data of fast-folding proteins.
arXiv Detail & Related papers (2024-10-12T23:22:49Z) - Force-Guided Bridge Matching for Full-Atom Time-Coarsened Dynamics of Peptides [17.559471937824767]
We propose a conditional generative model called Force-guided Bridge Matching (FBM). FBM learns full-atom time-coarsened dynamics and targets the Boltzmann-constrained distribution. Experiments on two peptide datasets verify its superiority in terms of comprehensive metrics.
arXiv Detail & Related papers (2024-08-27T15:07:27Z) - Learning Geometrically Disentangled Representations of Protein Folding Simulations [72.03095377508856]
This work focuses on learning a generative neural network on a structural ensemble of a drug-target protein.
Model tasks involve characterizing the distinct structural fluctuations of the protein bound to various drug molecules.
Results show that our geometric learning-based method enjoys both accuracy and efficiency for generating complex structural variations.
arXiv Detail & Related papers (2022-05-20T19:38:00Z) - EBM-Fold: Fully-Differentiable Protein Folding Powered by Energy-based Models [53.17320541056843]
We propose a fully-differentiable approach for protein structure optimization, guided by a data-driven generative network.
Our EBM-Fold approach can efficiently produce high-quality decoys, compared against traditional Rosetta-based structure optimization routines.
arXiv Detail & Related papers (2021-05-11T03:40:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.