Torsion-Space Diffusion for Protein Backbone Generation with Geometric Refinement
- URL: http://arxiv.org/abs/2511.19184v1
- Date: Mon, 24 Nov 2025 14:51:29 GMT
- Title: Torsion-Space Diffusion for Protein Backbone Generation with Geometric Refinement
- Authors: Lakshaditya Singh, Adwait Shelke, Divyansh Agrawal,
- Abstract summary: New model generates protein backbones by denoising torsion angles, ensuring perfect local geometry by construction.<n>Experiments on standard PDB proteins demonstrate 100% bond-length accuracy and significantly improved structural compactness.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Designing new protein structures is fundamental to computational biology, enabling advances in therapeutic molecule discovery and enzyme engineering. Existing diffusion-based generative models typically operate in Cartesian coordinate space, where adding noise disrupts strict geometric constraints such as fixed bond lengths and angles, often producing physically invalid structures. To address this limitation, we propose a Torsion-Space Diffusion Model that generates protein backbones by denoising torsion angles, ensuring perfect local geometry by construction. A differentiable forward-kinematics module reconstructs 3D coordinates with fixed 3.8 Angstrom backbone bond lengths while a constrained post-processing refinement optimizes global compactness via Radius of Gyration (Rg) correction, without violating bond constraints. Experiments on standard PDB proteins demonstrate 100% bond-length accuracy and significantly improved structural compactness, reducing Rg error from 70% to 18.6% compared to Cartesian diffusion baselines. Overall, this hybrid torsion-diffusion plus geometric-refinement framework generates physically valid and compact protein backbones, providing a promising path toward full-atom protein generation.
Related papers
- Rigidity-Aware Geometric Pretraining for Protein Design and Conformational Ensembles [74.32932832937618]
We introduce $textbfRigidSSL$ ($textitRigidity-Aware Self-Supervised Learning$), a geometric pretraining framework.<n>Phase I (RigidSSL-Perturb) learns geometric priors from 432K structures from the AlphaFold Protein Structure Database with simulated perturbations.<n>Phase II (RigidSSL-MD) refines these representations on 1.3K molecular dynamics trajectories to capture physically realistic transitions.
arXiv Detail & Related papers (2026-03-02T21:32:30Z) - SaDiT: Efficient Protein Backbone Design via Latent Structural Tokenization and Diffusion Transformers [50.18388227899971]
We present SaDiT, a novel framework that accelerates protein backbone generation by integrating SaProt Tokenization with a Diffusion Transformer (DiT) architecture.<n>Experiments demonstrate that SaDiT outperforms state-of-the-art models, including RFDiffusion and Proteina, in both computational speed and structural viability.
arXiv Detail & Related papers (2026-02-06T13:50:13Z) - Protein generation with embedding learning for motif diversification [7.130556396588862]
A fundamental challenge in protein design is the trade-off between generating structural diversity and preserving motif biological function.<n>We introduce Protein Generation with Embedding Learning (PGEL), a framework that learns high-dimensional embeddings encoding sequence and structural features of a target motif.<n>PGEL is able to loosen geometric constraints while satisfying typical design metrics, leading to more diverse yet viable structures.
arXiv Detail & Related papers (2025-10-21T16:43:36Z) - ProteinAE: Protein Diffusion Autoencoders for Structure Encoding [64.77182442408254]
We introduce ProteinAE, a novel and streamlined protein diffusion autoencoder.<n>ProteinAE directly maps protein backbone coordinates from E(3) into a continuous, compact latent space.<n>We demonstrate that ProteinAE achieves state-of-the-art reconstruction quality, outperforming existing autoencoders.
arXiv Detail & Related papers (2025-10-12T14:30:32Z) - The Latent Road to Atoms: Backmapping Coarse-grained Protein Structures with Latent Diffusion [19.85659309869674]
Latent Diffusion Backmapping (LDB) is a novel approach leveraging denoising diffusion within latent space to address these challenges.
We evaluate LDB's state-of-the-art performance on three distinct protein datasets.
Our results position LDB as a powerful and scalable approach for backmapping, effectively bridging the gap between CG simulations and atomic-level analyses in computational biology.
arXiv Detail & Related papers (2024-10-17T06:38:07Z) - AbDiffuser: Full-Atom Generation of in vitro Functioning Antibodies [44.149969082612486]
AbDiffuser is an equivariant and physics-informed diffusion model for antibody 3D structures and sequences.
Our approach improves protein diffusion by taking advantage of domain knowledge and physics-based constraints.
Numerical experiments showcase the ability of AbDiffuser to generate antibodies that closely track the sequence and structural properties of a reference set.
arXiv Detail & Related papers (2023-07-28T11:57:44Z) - Internal-Coordinate Density Modelling of Protein Structure: Covariance
Matters [9.49959422062959]
We present a new strategy for modelling protein densities in internal coordinates, which uses constraints in 3D space to induce covariance structure between the internal degrees of freedom.
We demonstrate that our approach makes it possible to scale density models of internal coordinates to full protein backbones in two settings.
arXiv Detail & Related papers (2023-02-27T12:18:19Z) - State-specific protein-ligand complex structure prediction with a
multi-scale deep generative model [68.28309982199902]
We present NeuralPLexer, a computational approach that can directly predict protein-ligand complex structures.
Our study suggests that a data-driven approach can capture the structural cooperativity between proteins and small molecules, showing promise in accelerating the design of enzymes, drug molecules, and beyond.
arXiv Detail & Related papers (2022-09-30T01:46:38Z) - Learning Geometrically Disentangled Representations of Protein Folding
Simulations [72.03095377508856]
This work focuses on learning a generative neural network on a structural ensemble of a drug-target protein.
Model tasks involve characterizing the distinct structural fluctuations of the protein bound to various drug molecules.
Results show that our geometric learning-based method enjoys both accuracy and efficiency for generating complex structural variations.
arXiv Detail & Related papers (2022-05-20T19:38:00Z) - Independent SE(3)-Equivariant Models for End-to-End Rigid Protein
Docking [57.2037357017652]
We tackle rigid body protein-protein docking, i.e., computationally predicting the 3D structure of a protein-protein complex from the individual unbound structures.
We design a novel pairwise-independent SE(3)-equivariant graph matching network to predict the rotation and translation to place one of the proteins at the right docked position.
Our model, named EquiDock, approximates the binding pockets and predicts the docking poses using keypoint matching and alignment.
arXiv Detail & Related papers (2021-11-15T18:46:37Z) - G-VAE, a Geometric Convolutional VAE for ProteinStructure Generation [41.66010308405784]
We introduce a joint geometric-neural networks approach for comparing, deforming and generating 3D protein structures.
Our method is able to generate plausible structures, different from the structures in the training data.
arXiv Detail & Related papers (2021-06-22T16:52:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.