BioMD: All-atom Generative Model for Biomolecular Dynamics Simulation
- URL: http://arxiv.org/abs/2509.02642v1
- Date: Tue, 02 Sep 2025 07:12:50 GMT
- Title: BioMD: All-atom Generative Model for Biomolecular Dynamics Simulation
- Authors: Bin Feng, Jiying Zhang, Xinni Zhang, Zijing Liu, Yu Li,
- Abstract summary: We introduce BioMD, the first all-atom generative model to simulate long-timescale protein-ligand dynamics. For both datasets, BioMD generates highly realistic conformations, showing high physical plausibility and low reconstruction errors. These results establish BioMD as a tool for simulating complex biomolecular processes, offering broad applicability for computational chemistry and drug discovery.
- Score: 14.323362384234441
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Molecular dynamics (MD) simulations are essential tools in computational chemistry and drug discovery, offering crucial insights into dynamic molecular behavior. However, their utility is significantly limited by substantial computational costs, which severely restrict accessible timescales for many biologically relevant processes. Despite the encouraging performance of existing machine learning (ML) methods, they struggle to generate extended biomolecular system trajectories, primarily due to the lack of MD datasets and the large computational demands of modeling long historical trajectories. Here, we introduce BioMD, the first all-atom generative model to simulate long-timescale protein-ligand dynamics using a hierarchical framework of forecasting and interpolation. We demonstrate the effectiveness and versatility of BioMD on the DD-13M (ligand unbinding) and MISATO datasets. For both datasets, BioMD generates highly realistic conformations, showing high physical plausibility and low reconstruction errors. Besides, BioMD successfully generates ligand unbinding paths for 97.1% of the protein-ligand systems within ten attempts, demonstrating its ability to explore critical unbinding pathways. Collectively, these results establish BioMD as a tool for simulating complex biomolecular processes, offering broad applicability for computational chemistry and drug discovery.
Related papers
- Protein Language Model Embeddings Improve Generalization of Implicit Transfer Operators [10.025462072265706]
We show that incorporating auxiliary sources of information can improve the data efficiency and generalization of implicit transfer operators for molecular dynamics. Our approach, PLaTITO, achieves state-of-the-art performance on equilibrium sampling benchmarks for out-of-distribution protein systems.
arXiv Detail & Related papers (2026-02-11T09:26:12Z)
- Learning Cell-Aware Hierarchical Multi-Modal Representations for Robust Molecular Modeling [74.25438319700929]
We propose CHMR (Cell-aware Hierarchical Multi-modal Representations), a robust framework that models local-global dependencies between molecules and cellular responses. Evaluated on nine public benchmarks spanning 728 tasks, CHMR outperforms state-of-the-art baselines. Results demonstrate the advantage of hierarchy-aware, multimodal learning for reliable and biologically grounded molecular representations.
arXiv Detail & Related papers (2025-11-26T07:15:00Z)
- A Scalable and Quantum-Accurate Foundation Model for Biomolecular Force Field via Linearly Tensorized Quadrangle Attention [6.749581549330875]
We present LiTEN, a novel AI-based force field framework for atomistic biomolecular simulations. Building on LiTEN, LiTEN-FF is a robust AIFF foundation model, pre-trained on the nablaDFT dataset for broad chemical generalization. LiTEN achieves state-of-the-art (SOTA) performance across most evaluation subsets of rMD17, MD22, and Chignolin, outperforming leading models such as MACE, NequIP, and EquiFormer.
arXiv Detail & Related papers (2025-07-01T15:52:39Z)
- Enhanced Sampling, Public Dataset and Generative Model for Drug-Protein Dissociation Dynamics [10.80659641278556]
Drug-protein binding and dissociation dynamics are fundamental to understanding molecular interactions in biological systems. We propose a novel research paradigm that combines molecular dynamics simulations, enhanced sampling, and AI generative models to address this issue. Our ongoing efforts focus on expanding this methodology to encompass a broader spectrum of drug-protein complexes and exploring novel applications in pathway prediction.
arXiv Detail & Related papers (2025-04-25T14:10:06Z)
- A Multi-Grained Symmetric Differential Equation Model for Learning Protein-Ligand Binding Dynamics [73.35846234413611]
In drug discovery, molecular dynamics (MD) simulation provides a powerful tool for predicting binding affinities, estimating transport properties, and exploring pocket sites.
We propose NeuralMD, the first machine learning (ML) surrogate that can facilitate numerical MD and provide accurate simulations in protein-ligand binding dynamics.
We demonstrate the efficiency and effectiveness of NeuralMD, achieving over 1K$\times$ speedup compared to standard numerical MD simulations.
arXiv Detail & Related papers (2024-01-26T09:35:17Z)
- ProBio: A Protocol-guided Multimodal Dataset for Molecular Biology Lab [67.24684071577211]
The challenge of replicating research results has posed a significant impediment to the field of molecular biology.
We first curate a comprehensive multimodal dataset, named ProBio, as an initial step towards this objective.
Next, we devise two challenging benchmarks, transparent solution tracking and multimodal action recognition, to emphasize the unique characteristics and difficulties associated with activity understanding in BioLab settings.
arXiv Detail & Related papers (2023-11-01T14:44:01Z)
- Str2Str: A Score-based Framework for Zero-shot Protein Conformation Sampling [23.74897713386661]
The dynamic nature of proteins is crucial for determining their biological functions and properties.
Existing learning-based approaches perform direct sampling yet heavily rely on target-specific simulation data for training.
We propose Str2Str, a novel structure-to-structure translation framework capable of zero-shot conformation sampling.
arXiv Detail & Related papers (2023-06-05T15:19:06Z)
- Implicit Geometry and Interaction Embeddings Improve Few-Shot Molecular Property Prediction [53.06671763877109]
We develop molecular embeddings that encode complex molecular characteristics to improve the performance of few-shot molecular property prediction.
Our approach leverages large amounts of synthetic data, namely the results of molecular docking calculations.
On multiple molecular property prediction benchmarks, training from the embedding space substantially improves Multi-Task, MAML, and Prototypical Network few-shot learning performance.
arXiv Detail & Related papers (2023-02-04T01:32:40Z)
- Conditional Generative Models for Simulation of EMG During Naturalistic Movements [45.698312905115955]
We present a conditional generative neural network trained adversarially to generate motor unit activation potential waveforms.
We demonstrate the ability of such a model to predictively interpolate between a much smaller number of the numerical model's outputs with high accuracy.
arXiv Detail & Related papers (2022-11-03T14:49:02Z)
- Differentiable Agent-based Epidemiology [71.81552021144589]
We introduce GradABM: a scalable, differentiable design for agent-based modeling that is amenable to gradient-based learning with automatic differentiation.
GradABM can quickly simulate million-size populations in a few seconds on commodity hardware, integrate with deep neural networks, and ingest heterogeneous data sources.
arXiv Detail & Related papers (2022-07-20T07:32:02Z)
- Accurate Machine Learned Quantum-Mechanical Force Fields for Biomolecular Simulations [51.68332623405432]
Molecular dynamics (MD) simulations allow atomistic insights into chemical and biological processes.
Recently, machine learned force fields (MLFFs) emerged as an alternative means to execute MD simulations.
This work proposes a general approach to constructing accurate MLFFs for large-scale molecular simulations.
arXiv Detail & Related papers (2022-05-17T13:08:28Z)
- Accelerated Simulations of Molecular Systems through Learning of their Effective Dynamics [4.276697874428501]
We present a novel framework to advance simulation by up to three orders of magnitude.
LED learns the effective dynamics of molecular systems.
We demonstrate the effectiveness of LED in the Müller-Brown potential, the Trp Cage protein, and the alanine dipeptide.
arXiv Detail & Related papers (2021-02-17T15:15:37Z)
- Towards an Automatic Analysis of CHO-K1 Suspension Growth in Microfluidic Single-cell Cultivation [63.94623495501023]
We propose a novel Machine Learning architecture, which allows us to infuse a neural deep network with human-powered abstraction on the level of data.
Specifically, we train a generative model simultaneously on natural and synthetic data, so that it learns a shared representation, from which a target variable, such as the cell count, can be reliably estimated.
arXiv Detail & Related papers (2020-10-20T08:36:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.