Triangle Multiplication Is All You Need For Biomolecular Structure Representations
- URL: http://arxiv.org/abs/2510.18870v1
- Date: Tue, 21 Oct 2025 17:59:02 GMT
- Title: Triangle Multiplication Is All You Need For Biomolecular Structure Representations
- Authors: Jeffrey Ouyang-Zhang, Pranav Murugan, Daniel J. Diaz, Gianluca Scarpellini, Richard Strong Bowen, Nate Gruver, Adam Klivans, Philipp Krähenbühl, Aleksandra Faust, Maruan Al-Shedivat
- Abstract summary: We introduce Pairmixer, a streamlined alternative that eliminates triangle attention while preserving higher-order geometric reasoning capabilities. Pairmixer substantially improves computational efficiency, matching state-of-the-art structure predictors across folding and docking benchmarks. Within BoltzDesign, for example, Pairmixer delivers over 2x faster sampling and scales to sequences 30% longer than the memory limits of Pairformer.
- Score: 56.26342479807906
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: AlphaFold has transformed protein structure prediction, but emerging applications such as virtual ligand screening, proteome-wide folding, and de novo binder design demand predictions at a massive scale, where runtime and memory costs become prohibitive. A major bottleneck lies in the Pairformer backbone of AlphaFold3-style models, which relies on computationally expensive triangular primitives (especially triangle attention) for pairwise reasoning. We introduce Pairmixer, a streamlined alternative that eliminates triangle attention while preserving higher-order geometric reasoning capabilities that are critical for structure prediction. Pairmixer substantially improves computational efficiency, matching state-of-the-art structure predictors across folding and docking benchmarks, delivering up to 4x faster inference on long sequences while reducing training cost by 34%. Its efficiency alleviates the computational burden of downstream applications such as modeling large protein complexes, high-throughput ligand and binder screening, and hallucination-based design. Within BoltzDesign, for example, Pairmixer delivers over 2x faster sampling and scales to sequences ~30% longer than the memory limits of Pairformer.
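To make the triangle-multiplication primitive in the title concrete, below is a minimal NumPy sketch of an AlphaFold-style triangle multiplicative update (outgoing-edge variant) on a pair representation. The tensor sizes, weight names, and the residual usage at the end are illustrative assumptions; this is a sketch of the general primitive, not Pairmixer's actual implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def layer_norm(x, eps=1e-5):
    # Normalize over the channel (last) dimension.
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def triangle_multiply_outgoing(z, params):
    """One 'outgoing' triangle multiplicative update on a pair representation.

    z      : (N, N, c) pair features; z[i, j] describes the edge i -> j
    params : dict of weight matrices (shapes noted below), randomly
             initialized here for illustration only
    """
    # Gated linear projections of the two edges that share the third node k.
    a = sigmoid(z @ params["a_gate"]) * (z @ params["a_proj"])   # (N, N, h)
    b = sigmoid(z @ params["b_gate"]) * (z @ params["b_proj"])   # (N, N, h)
    # Third-node aggregation: edge (i, j) gathers evidence from all k
    # via the edge pair (i -> k, j -> k).
    t = np.einsum("ikh,jkh->ijh", a, b)                          # (N, N, h)
    # Gate and project back to the pair channel dimension c.
    g = sigmoid(z @ params["out_gate"])                          # (N, N, c)
    return g * (layer_norm(t) @ params["out_proj"])              # (N, N, c)

# Tiny usage example with random weights (shapes only; not trained).
N, c, h = 8, 16, 32
rng = np.random.default_rng(0)
z = rng.normal(size=(N, N, c))
params = {
    "a_gate": rng.normal(size=(c, h)), "a_proj": rng.normal(size=(c, h)),
    "b_gate": rng.normal(size=(c, h)), "b_proj": rng.normal(size=(c, h)),
    "out_gate": rng.normal(size=(c, c)), "out_proj": rng.normal(size=(h, c)),
}
z_updated = z + triangle_multiply_outgoing(z, params)  # residual update, (N, N, c)
```

The einsum over the shared third node k is what gives triangle multiplication its higher-order (three-body) reasoning over edges (i, k) and (j, k), without forming an attention map over node triples as triangle attention does.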
Related papers
- Protenix-Mini+: efficient structure prediction model with scalable pairformer [17.839471210239186]
Protenix-Mini+ is a highly lightweight and scalable variant of the Protenix model. Within an acceptable range of performance degradation, it substantially improves computational efficiency.
arXiv Detail & Related papers (2025-10-13T21:54:35Z)
- Predictive Feature Caching for Training-free Acceleration of Molecular Geometry Generation [67.20779609022108]
Flow matching models generate high-fidelity molecular geometries but incur significant computational costs during inference. This work discusses a training-free caching strategy that accelerates molecular geometry generation. Experiments on the GEOM-Drugs dataset demonstrate that caching achieves a twofold reduction in wall-clock inference time.
arXiv Detail & Related papers (2025-10-06T09:49:14Z)
- Protenix-Mini: Efficient Structure Predictor via Compact Architecture, Few-Step Diffusion and Switchable pLM [37.865341638265534]
We present Protenix-Mini, a compact and optimized model for efficient protein structure prediction. By eliminating redundant Transformer components and refining the sampling process, Protenix-Mini significantly reduces model complexity with only a slight accuracy drop.
arXiv Detail & Related papers (2025-07-16T02:08:25Z)
- AutoHFormer: Efficient Hierarchical Autoregressive Transformer for Time Series Prediction [36.239648954658534]
Time series forecasting requires architectures that simultaneously achieve three competing objectives. We introduce AutoHFormer, a hierarchical autoregressive transformer that addresses these challenges. Comprehensive experiments demonstrate that AutoHFormer achieves 10.76X faster training and a 6.06X memory reduction compared to PatchTST on P08.
arXiv Detail & Related papers (2025-06-19T03:47:04Z)
- CompGS++: Compressed Gaussian Splatting for Static and Dynamic Scene Representation [60.712165339762116]
CompGS++ is a novel framework that leverages compact Gaussian primitives to achieve accurate 3D modeling. Our design is based on the principle of eliminating redundancy both between and within primitives. Our implementation will be made publicly available on GitHub to facilitate further research.
arXiv Detail & Related papers (2025-04-17T15:33:01Z)
- Unified Cross-Scale 3D Generation and Understanding via Autoregressive Modeling [32.45851798752336]
We propose Uni-3DAR, a unified autoregressive framework for cross-scale 3D generation and understanding. At its core is a coarse-to-fine tokenizer based on octree data structures, which compresses diverse 3D structures into compact 1D token sequences. To address the challenge of dynamically varying token positions introduced by compression, we introduce a masked next-token prediction strategy.
arXiv Detail & Related papers (2025-03-20T16:07:04Z)
- Memory-Efficient 4-bit Preconditioned Stochastic Optimization [53.422307389223626]
We introduce 4-bit quantization for Shampoo's preconditioners. To our knowledge, this is the first quantization approach applied to Cholesky factors of preconditioners. We demonstrate that combining Cholesky quantization with error feedback enhances memory efficiency and algorithm performance (a toy sketch of the error-feedback idea follows this entry).
arXiv Detail & Related papers (2024-12-14T03:32:54Z)
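As a rough illustration of the error-feedback idea mentioned in the entry above, the toy sketch below applies symmetric 4-bit quantization to a matrix while carrying the quantization residual into the next step. The single per-matrix scale and the function names are simplifying assumptions, not the paper's actual Cholesky-factor quantizer.

```python
import numpy as np

def quantize_4bit(x):
    """Symmetric linear 4-bit quantization of a matrix (toy version).

    Returns integer codes in [-7, 7] plus the per-matrix scale needed to
    dequantize. Practical preconditioner quantizers typically use finer
    block-wise scaling; a single scale is used here for clarity.
    """
    scale = np.max(np.abs(x)) / 7.0 + 1e-12
    codes = np.clip(np.round(x / scale), -7, 7).astype(np.int8)
    return codes, scale

def dequantize(codes, scale):
    return codes.astype(np.float64) * scale

def quantize_with_error_feedback(x, err):
    """Quantize x while folding in the error carried over from the last step.

    The residual between the compensated input and its dequantized value is
    returned as the new error buffer, so quantization error accumulates into
    future steps instead of being lost.
    """
    compensated = x + err
    codes, scale = quantize_4bit(compensated)
    new_err = compensated - dequantize(codes, scale)
    return codes, scale, new_err

# Toy usage: repeatedly quantize a slowly drifting matrix.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 4))
err = np.zeros_like(x)
for _ in range(5):
    x = x + 0.01 * rng.normal(size=(4, 4))   # stand-in for a preconditioner update
    codes, scale, err = quantize_with_error_feedback(x, err)
```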
- Geometric Algebra Planes: Convex Implicit Neural Volumes [70.12234371845445]
We show that GA-Planes is equivalent to a sparse low-rank factor plus a low-resolution matrix.
We also show that GA-Planes can be adapted for many existing representations.
arXiv Detail & Related papers (2024-11-20T18:21:58Z)
- Video Prediction Transformers without Recurrence or Convolution [65.93130697098658]
We propose PredFormer, a framework entirely based on Gated Transformers. We provide a comprehensive analysis of 3D Attention in the context of video prediction. The significant improvements in both accuracy and efficiency highlight the potential of PredFormer.
arXiv Detail & Related papers (2024-10-07T03:52:06Z)
- Lean Attention: Hardware-Aware Scalable Attention Mechanism for the Decode-Phase of Transformers [4.674454841332859]
Transformer-based models have emerged as one of the most widely used architectures for natural language processing. These huge models are memory hungry and incur significant inference latency even on cutting-edge AI-accelerators. We propose LeanAttention, a scalable technique for computing self-attention in the token-generation phase.
arXiv Detail & Related papers (2024-05-17T00:52:39Z)
- Sketching as a Tool for Understanding and Accelerating Self-attention for Long Sequences [52.6022911513076]
Transformer-based models are not efficient in processing long sequences due to the quadratic space and time complexity of the self-attention modules.
Linformer and Informer reduce the quadratic complexity to linear (modulo logarithmic factors) via low-dimensional projection and row selection, respectively (a toy illustration of the projection idea follows this entry).
Based on the theoretical analysis, we propose Skeinformer to accelerate self-attention and further improve the accuracy of matrix approximation to self-attention.
arXiv Detail & Related papers (2021-12-10T06:58:05Z)
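As a toy illustration of the low-dimensional projection idea summarized in the last entry (a Linformer-style construction, not Skeinformer's own sketching algorithm), the sketch below projects keys and values along the sequence axis so the attention map is n x k rather than n x n. The projection matrices, their scaling, and all sizes are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def projected_self_attention(Q, K, V, E, F):
    """Self-attention with keys/values projected along the sequence axis.

    Q, K, V : (n, d) queries, keys, values
    E, F    : (k, n) fixed projection matrices with k << n
    The attention matrix is (n, k) rather than (n, n), so time and memory
    scale linearly in n for a fixed projection size k.
    """
    d = Q.shape[-1]
    K_proj = E @ K                                 # (k, d)
    V_proj = F @ V                                 # (k, d)
    scores = Q @ K_proj.T / np.sqrt(d)             # (n, k)
    return softmax(scores, axis=-1) @ V_proj       # (n, d)

# Toy usage with random inputs and random projections.
n, d, k = 512, 64, 32
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(n, d)) for _ in range(3))
E = rng.normal(size=(k, n)) / np.sqrt(n)
F = rng.normal(size=(k, n)) / np.sqrt(n)
out = projected_self_attention(Q, K, V, E, F)      # (n, d)
```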