HemePLM-Diffuse: A Scalable Generative Framework for Protein-Ligand Dynamics in Large Biomolecular System
- URL: http://arxiv.org/abs/2508.16587v1
- Date: Thu, 07 Aug 2025 17:29:52 GMT
- Title: HemePLM-Diffuse: A Scalable Generative Framework for Protein-Ligand Dynamics in Large Biomolecular System
- Authors: Rakesh Thakur, Riya Gupta
- Abstract summary: We introduce HemePLM-Diffuse, an innovative generative transformer model that is designed for accurate simulation of protein-ligand trajectories. We show its capabilities using the 3CQV HEME system, showing enhanced accuracy and scalability compared to leading models such as TorchMD-Net, MDGEN, and Uni-Mol.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Understanding the long-timescale dynamics of protein-ligand complexes is important for drug discovery and structural biology, but it remains computationally challenging for large biomolecular systems. We introduce HemePLM-Diffuse, an innovative generative transformer model designed to simulate protein-ligand trajectories accurately, inpaint missing ligand fragments, and sample transition paths in systems with more than 10,000 atoms. HemePLM-Diffuse uses an SE(3)-invariant tokenization approach for proteins and ligands, together with time-aware cross-attentional diffusion, to capture atomic motion effectively. We demonstrate its capabilities using the 3CQV HEME system, showing enhanced accuracy and scalability compared to leading models such as TorchMD-Net, MDGEN, and Uni-Mol.
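The abstract names an SE(3)-invariant tokenization but does not describe it. As a minimal sketch of what SE(3) invariance means, and not the authors' method, the example below checks that interatomic distances are unchanged when a set of atoms is rotated and translated. The helper names (`pairwise_distances`, `rigid_transform`) and the random coordinates are hypothetical, and the rotation is restricted to the z axis for brevity.

```python
import math
import random


def pairwise_distances(coords):
    """Return all interatomic distances, ordered by atom index pair (i < j)."""
    n = len(coords)
    return [math.dist(coords[i], coords[j]) for i in range(n) for j in range(i + 1, n)]


def rotate_z(point, angle):
    """Rotate a 3D point about the z axis by `angle` radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y, z)


def rigid_transform(coords, angle, shift):
    """Apply a rotation about z followed by a translation (one element of SE(3))."""
    return [tuple(a + b for a, b in zip(rotate_z(p, angle), shift)) for p in coords]


# Hypothetical toy "molecule": six atoms at random positions.
random.seed(0)
atoms = [tuple(random.uniform(-5.0, 5.0) for _ in range(3)) for _ in range(6)]
moved = rigid_transform(atoms, angle=1.2, shift=(3.0, -2.0, 0.5))

before = pairwise_distances(atoms)
after = pairwise_distances(moved)

# Distances are SE(3)-invariant, so the difference should be at floating-point noise level.
max_diff = max(abs(a - b) for a, b in zip(before, after))
print(f"largest change in any pairwise distance: {max_diff:.2e}")
```

Features built only from such quantities give a model the same input regardless of how the complex is oriented in space, which is the property the abstract attributes to its tokenization.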
Related papers
- UBio-MolFM: A Universal Molecular Foundation Model for Bio-Systems [12.633470669776317]
We present UBio-MolFM, a universal foundation model framework designed to bridge the gap between quantum-mechanical (QM) accuracy and biological scale.
UBio-MolFM achieves ab initio-level fidelity on large, out-of-distribution biomolecular systems (up to 1,500 atoms) and realistic observables.
arXiv Detail & Related papers (2026-02-13T04:38:28Z)
- SaDiT: Efficient Protein Backbone Design via Latent Structural Tokenization and Diffusion Transformers [50.18388227899971]
We present SaDiT, a novel framework that accelerates protein backbone generation by integrating SaProt Tokenization with a Diffusion Transformer (DiT) architecture.
Experiments demonstrate that SaDiT outperforms state-of-the-art models, including RFDiffusion and Proteina, in both computational speed and structural viability.
arXiv Detail & Related papers (2026-02-06T13:50:13Z)
- Scalable Machine Learning Force Fields for Macromolecular Systems Through Long-Range Aware Message Passing [18.50744268453995]
Machine learning force fields (MLFFs) have revolutionized molecular simulations by providing quantum mechanical accuracy at the speed of molecular mechanical computations.
However, a fundamental reliance of these models on fixed-cutoff architectures limits their applicability to macromolecular systems where long-range interactions dominate.
We demonstrate that this locality constraint causes force prediction errors to scale monotonically with system size, revealing a critical architectural bottleneck.
We introduce E2Former-LSR, an equivariant transformer that explicitly integrates long-range attention blocks. E2Former-LSR achieves stable error scaling, superior fidelity in capturing non-covalent decay, and
arXiv Detail & Related papers (2026-01-07T10:12:34Z)
- ProteinAE: Protein Diffusion Autoencoders for Structure Encoding [64.77182442408254]
We introduce ProteinAE, a novel and streamlined protein diffusion autoencoder.
ProteinAE directly maps protein backbone coordinates from E(3) into a continuous, compact latent space.
We demonstrate that ProteinAE achieves state-of-the-art reconstruction quality, outperforming existing autoencoders.
arXiv Detail & Related papers (2025-10-12T14:30:32Z)
- A Scalable and Quantum-Accurate Foundation Model for Biomolecular Force Field via Linearly Tensorized Quadrangle Attention [6.749581549330875]
We present LiTEN, a novel AI-based force field framework for atomistic biomolecular simulations.
Building on LiTEN, LiTEN-FF is a robust AIFF foundation model, pre-trained on the nablaDFT dataset for broad chemical generalization.
LiTEN achieves state-of-the-art (SOTA) performance across most evaluation subsets of rMD17, MD22, and Chignolin, outperforming leading models such as MACE, NequIP, and EquiFormer.
arXiv Detail & Related papers (2025-07-01T15:52:39Z)
- Towards Unified and Lossless Latent Space for 3D Molecular Latent Diffusion Modeling [77.26556208024633]
3D molecule generation is crucial for drug discovery and material science.
Existing approaches typically maintain separate latent spaces for invariant and equivariant modalities.
We propose UAE-3D, a multi-modal VAE that compresses 3D molecules into latent sequences from a unified latent space.
arXiv Detail & Related papers (2025-03-19T08:56:13Z)
- Structure Language Models for Protein Conformation Generation [66.42864253026053]
Traditional physics-based simulation methods often struggle with sampling equilibrium conformations.
Deep generative models have shown promise in generating protein conformations as a more efficient alternative.
We introduce Structure Language Modeling as a novel framework for efficient protein conformation generation.
arXiv Detail & Related papers (2024-10-24T03:38:51Z)
- Protein Conformation Generation via Force-Guided SE(3) Diffusion Models [48.48934625235448]
Deep generative modeling techniques have been employed to generate novel protein conformations.
We propose a force-guided SE(3) diffusion model, ConfDiff, for protein conformation generation.
arXiv Detail & Related papers (2024-03-21T02:44:08Z)
- From Peptides to Nanostructures: A Euclidean Transformer for Fast and Stable Machine Learned Force Fields [5.013279299982324]
We propose a transformer architecture called SO3krates that combines sparse equivariant representations with a self-attention mechanism.
SO3krates achieves a unique combination of accuracy, stability, and speed that enables insightful analysis of quantum properties of matter on extended time and system size scales.
arXiv Detail & Related papers (2023-09-21T09:22:05Z)
- Implicit Transfer Operator Learning: Multiple Time-Resolution Surrogates for Molecular Dynamics [8.35780131268962]
We present Implicit Transfer Operator (ITO) Learning, a framework to learn surrogates of the simulation process with multiple time-resolutions.
We also present a coarse-grained CG-SE3-ITO model which can quantitatively model all-atom molecular dynamics.
arXiv Detail & Related papers (2023-05-29T12:19:41Z)
- Protein-Ligand Complex Generator & Drug Screening via Tiered Tensor Transform [18.509174420141832]
We develop an algorithm to generate diverse protein-ligand complex conformations for both pose and affinity estimation in drug screening.
The 3T structure transformation is decoupled from the system physics, making future usage in other computational scientific domains possible.
arXiv Detail & Related papers (2023-01-03T07:33:20Z)
- State-specific protein-ligand complex structure prediction with a multi-scale deep generative model [68.28309982199902]
We present NeuralPLexer, a computational approach that can directly predict protein-ligand complex structures.
Our study suggests that a data-driven approach can capture the structural cooperativity between proteins and small molecules, showing promise in accelerating the design of enzymes, drug molecules, and beyond.
arXiv Detail & Related papers (2022-09-30T01:46:38Z)
- Accurate Machine Learned Quantum-Mechanical Force Fields for Biomolecular Simulations [51.68332623405432]
Molecular dynamics (MD) simulations allow atomistic insights into chemical and biological processes.
Recently, machine learned force fields (MLFFs) emerged as an alternative means to execute MD simulations.
This work proposes a general approach to constructing accurate MLFFs for large-scale molecular simulations.
arXiv Detail & Related papers (2022-05-17T13:08:28Z)
- Accelerated Simulations of Molecular Systems through Learning of their Effective Dynamics [4.276697874428501]
We present a novel framework to advance simulation by up to three orders of magnitude.
LED learns the effective dynamics of molecular systems.
We demonstrate the effectiveness of LED in the Müller-Brown potential, the Trp Cage protein, and the alanine dipeptide.
arXiv Detail & Related papers (2021-02-17T15:15:37Z)