Triangle Multiplication Is All You Need For Biomolecular Structure Representations
- URL: http://arxiv.org/abs/2510.18870v1
- Date: Tue, 21 Oct 2025 17:59:02 GMT
- Title: Triangle Multiplication Is All You Need For Biomolecular Structure Representations
- Authors: Jeffrey Ouyang-Zhang, Pranav Murugan, Daniel J. Diaz, Gianluca Scarpellini, Richard Strong Bowen, Nate Gruver, Adam Klivans, Philipp Krähenbühl, Aleksandra Faust, Maruan Al-Shedivat
- Abstract summary: We introduce Pairmixer, a streamlined alternative that eliminates triangle attention while preserving higher-order geometric reasoning capabilities. Pairmixer substantially improves computational efficiency, matching state-of-the-art structure predictors across folding and docking benchmarks. Within BoltzDesign, for example, Pairmixer delivers over 2x faster sampling and scales to sequences 30% longer than the memory limits of Pairformer.
- Score: 56.26342479807906
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: AlphaFold has transformed protein structure prediction, but emerging applications such as virtual ligand screening, proteome-wide folding, and de novo binder design demand predictions at a massive scale, where runtime and memory costs become prohibitive. A major bottleneck lies in the Pairformer backbone of AlphaFold3-style models, which relies on computationally expensive triangular primitives (especially triangle attention) for pairwise reasoning. We introduce Pairmixer, a streamlined alternative that eliminates triangle attention while preserving higher-order geometric reasoning capabilities that are critical for structure prediction. Pairmixer substantially improves computational efficiency, matching state-of-the-art structure predictors across folding and docking benchmarks, delivering up to 4x faster inference on long sequences while reducing training cost by 34%. Its efficiency alleviates the computational burden of downstream applications such as modeling large protein complexes, high-throughput ligand and binder screening, and hallucination-based design. Within BoltzDesign, for example, Pairmixer delivers over 2x faster sampling and scales to sequences ~30% longer than the memory limits of Pairformer.
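To make the triangle-multiplication primitive in the title concrete, below is a minimal NumPy sketch of an AlphaFold-style triangle multiplicative update (outgoing-edge variant) on a pair representation. The tensor sizes, weight names, and the residual usage at the end are illustrative assumptions; this is a sketch of the general primitive, not Pairmixer's actual implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def layer_norm(x, eps=1e-5):
    # Normalize over the channel (last) dimension.
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def triangle_multiply_outgoing(z, params):
    """One 'outgoing' triangle multiplicative update on a pair representation.

    z      : (N, N, c) pair features; z[i, j] describes the edge i -> j
    params : dict of weight matrices (shapes noted below), randomly
             initialized here for illustration only
    """
    # Gated linear projections of the two edges that share the third node k.
    a = sigmoid(z @ params["a_gate"]) * (z @ params["a_proj"])   # (N, N, h)
    b = sigmoid(z @ params["b_gate"]) * (z @ params["b_proj"])   # (N, N, h)
    # Third-node aggregation: edge (i, j) gathers evidence from all k
    # via the edge pair (i -> k, j -> k).
    t = np.einsum("ikh,jkh->ijh", a, b)                          # (N, N, h)
    # Gate and project back to the pair channel dimension c.
    g = sigmoid(z @ params["out_gate"])                          # (N, N, c)
    return g * (layer_norm(t) @ params["out_proj"])              # (N, N, c)

# Tiny usage example with random weights (shapes only; not trained).
N, c, h = 8, 16, 32
rng = np.random.default_rng(0)
z = rng.normal(size=(N, N, c))
params = {
    "a_gate": rng.normal(size=(c, h)), "a_proj": rng.normal(size=(c, h)),
    "b_gate": rng.normal(size=(c, h)), "b_proj": rng.normal(size=(c, h)),
    "out_gate": rng.normal(size=(c, c)), "out_proj": rng.normal(size=(h, c)),
}
z_updated = z + triangle_multiply_outgoing(z, params)  # residual update, (N, N, c)
```

The einsum over the shared third node k is what gives triangle multiplication its higher-order (three-body) reasoning over edges (i, k) and (j, k), without forming an attention map over node triples as triangle attention does.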
Related papers
- Protenix-Mini+: efficient structure prediction model with scalable pairformer [17.839471210239186]
Protenix-Mini+ is a highly lightweight and scalable variant of the Protenix model. Within an acceptable range of performance degradation, it substantially improves computational efficiency.
arXiv Detail & Related papers (2025-10-13T21:54:35Z)
- Predictive Feature Caching for Training-free Acceleration of Molecular Geometry Generation [67.20779609022108]
Flow matching models generate high-fidelity molecular geometries but incur significant computational costs during inference. This work discusses a training-free caching strategy that accelerates molecular geometry generation. Experiments on the GEOM-Drugs dataset demonstrate that caching achieves a twofold reduction in wall-clock inference time.
arXiv Detail & Related papers (2025-10-06T09:49:14Z)
- Protenix-Mini: Efficient Structure Predictor via Compact Architecture, Few-Step Diffusion and Switchable pLM [37.865341638265534]
We present Protenix-Mini, a compact and optimized model for efficient protein structure prediction. By eliminating redundant Transformer components and refining the sampling process, Protenix-Mini significantly reduces model complexity with only a slight accuracy drop.
arXiv Detail & Related papers (2025-07-16T02:08:25Z)
- AutoHFormer: Efficient Hierarchical Autoregressive Transformer for Time Series Prediction [36.239648954658534]
Time series forecasting requires architectures that simultaneously achieve three competing objectives. We introduce AutoHFormer, a hierarchical autoregressive transformer that addresses these challenges. Comprehensive experiments demonstrate that AutoHFormer achieves 10.76X faster training and a 6.06X memory reduction compared to PatchTST on P08.
arXiv Detail & Related papers (2025-06-19T03:47:04Z)
- CompGS++: Compressed Gaussian Splatting for Static and Dynamic Scene Representation [60.712165339762116]
CompGS++ is a novel framework that leverages compact Gaussian primitives to achieve accurate 3D modeling. Our design is based on the principle of eliminating redundancy both between and within primitives. Our implementation will be made publicly available on GitHub to facilitate further research.
arXiv Detail & Related papers (2025-04-17T15:33:01Z)
- Unified Cross-Scale 3D Generation and Understanding via Autoregressive Modeling [32.45851798752336]
We propose Uni-3DAR, a unified autoregressive framework for cross-scale 3D generation and understanding. At its core is a coarse-to-fine tokenizer based on octree data structures, which compresses diverse 3D structures into compact 1D token sequences. To address the challenge of dynamically varying token positions introduced by compression, we introduce a masked next-token prediction strategy.
arXiv Detail & Related papers (2025-03-20T16:07:04Z)
- Memory-Efficient 4-bit Preconditioned Stochastic Optimization [53.422307389223626]
We introduce 4-bit quantization for Shampoo's preconditioners. To our knowledge, this is the first quantization approach applied to Cholesky factors of preconditioners. We demonstrate that combining Cholesky quantization with error feedback enhances memory efficiency and algorithm performance (a toy sketch of the error-feedback idea follows this entry).
arXiv Detail & Related papers (2024-12-14T03:32:54Z)
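As a rough illustration of the error-feedback idea mentioned in the entry above, the toy sketch below applies symmetric 4-bit quantization to a matrix while carrying the quantization residual into the next step. The single per-matrix scale and the function names are simplifying assumptions, not the paper's actual Cholesky-factor quantizer.

```python
import numpy as np

def quantize_4bit(x):
    """Symmetric linear 4-bit quantization of a matrix (toy version).

    Returns integer codes in [-7, 7] plus the per-matrix scale needed to
    dequantize. Practical preconditioner quantizers typically use finer
    block-wise scaling; a single scale is used here for clarity.
    """
    scale = np.max(np.abs(x)) / 7.0 + 1e-12
    codes = np.clip(np.round(x / scale), -7, 7).astype(np.int8)
    return codes, scale

def dequantize(codes, scale):
    return codes.astype(np.float64) * scale

def quantize_with_error_feedback(x, err):
    """Quantize x while folding in the error carried over from the last step.

    The residual between the compensated input and its dequantized value is
    returned as the new error buffer, so quantization error accumulates into
    future steps instead of being lost.
    """
    compensated = x + err
    codes, scale = quantize_4bit(compensated)
    new_err = compensated - dequantize(codes, scale)
    return codes, scale, new_err

# Toy usage: repeatedly quantize a slowly drifting matrix.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 4))
err = np.zeros_like(x)
for _ in range(5):
    x = x + 0.01 * rng.normal(size=(4, 4))   # stand-in for a preconditioner update
    codes, scale, err = quantize_with_error_feedback(x, err)
```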
- Geometric Algebra Planes: Convex Implicit Neural Volumes [70.12234371845445]
We show that GA-Planes is equivalent to a sparse low-rank factor plus a low-resolution matrix.
We also show that GA-Planes can be adapted for many existing representations.
arXiv Detail & Related papers (2024-11-20T18:21:58Z)
- Video Prediction Transformers without Recurrence or Convolution [65.93130697098658]
We propose PredFormer, a framework entirely based on Gated Transformers. We provide a comprehensive analysis of 3D Attention in the context of video prediction. The significant improvements in both accuracy and efficiency highlight the potential of PredFormer.
arXiv Detail & Related papers (2024-10-07T03:52:06Z)
- Lean Attention: Hardware-Aware Scalable Attention Mechanism for the Decode-Phase of Transformers [4.674454841332859]
Transformer-based models have emerged as one of the most widely used architectures for natural language processing. These huge models are memory hungry and incur significant inference latency even on cutting-edge AI-accelerators. We propose LeanAttention, a scalable technique for computing self-attention in the token-generation phase.
arXiv Detail & Related papers (2024-05-17T00:52:39Z)
- Sketching as a Tool for Understanding and Accelerating Self-attention for Long Sequences [52.6022911513076]
Transformer-based models are not efficient in processing long sequences due to the quadratic space and time complexity of the self-attention modules.
Linformer and Informer reduce the quadratic complexity to linear (modulo logarithmic factors) via low-dimensional projection and row selection, respectively (a toy illustration of the projection idea follows this entry).
Based on the theoretical analysis, we propose Skeinformer to accelerate self-attention and further improve the accuracy of matrix approximation to self-attention.
arXiv Detail & Related papers (2021-12-10T06:58:05Z)
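As a toy illustration of the low-dimensional projection idea summarized in the last entry (a Linformer-style construction, not Skeinformer's own sketching algorithm), the sketch below projects keys and values along the sequence axis so the attention map is n x k rather than n x n. The projection matrices, their scaling, and all sizes are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def projected_self_attention(Q, K, V, E, F):
    """Self-attention with keys/values projected along the sequence axis.

    Q, K, V : (n, d) queries, keys, values
    E, F    : (k, n) fixed projection matrices with k << n
    The attention matrix is (n, k) rather than (n, n), so time and memory
    scale linearly in n for a fixed projection size k.
    """
    d = Q.shape[-1]
    K_proj = E @ K                                 # (k, d)
    V_proj = F @ V                                 # (k, d)
    scores = Q @ K_proj.T / np.sqrt(d)             # (n, k)
    return softmax(scores, axis=-1) @ V_proj       # (n, d)

# Toy usage with random inputs and random projections.
n, d, k = 512, 64, 32
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(n, d)) for _ in range(3))
E = rng.normal(size=(k, n)) / np.sqrt(n)
F = rng.normal(size=(k, n)) / np.sqrt(n)
out = projected_self_attention(Q, K, V, E, F)      # (n, d)
```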