Platonic Transformers: A Solid Choice For Equivariance
- URL: http://arxiv.org/abs/2510.03511v2
- Date: Wed, 08 Oct 2025 00:09:15 GMT
- Title: Platonic Transformers: A Solid Choice For Equivariance
- Authors: Mohammad Mohaiminul Islam, Rishabh Anand, David R. Wessels, Friso de Kruiff, Thijs P. Kuipers, Rex Ying, Clara I. Sánchez, Sharvaree Vadgama, Georg Bökman, Erik J. Bekkers,
- Abstract summary: We introduce the Platonic Transformer to resolve this trade-off.<n>By defining attention relative to reference frames from the Platonic solid symmetry groups, our method induces a principled weight-sharing scheme.<n>We show that this attention is formally equivalent to a dynamic group convolution, which reveals that the model learns adaptive geometric filters.
- Score: 25.29042615187704
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While widespread, Transformers lack inductive biases for geometric symmetries common in science and computer vision. Existing equivariant methods often sacrifice the efficiency and flexibility that make Transformers so effective through complex, computationally intensive designs. We introduce the Platonic Transformer to resolve this trade-off. By defining attention relative to reference frames from the Platonic solid symmetry groups, our method induces a principled weight-sharing scheme. This enables combined equivariance to continuous translations and Platonic symmetries, while preserving the exact architecture and computational cost of a standard Transformer. Furthermore, we show that this attention is formally equivalent to a dynamic group convolution, which reveals that the model learns adaptive geometric filters and enables a highly scalable, linear-time convolutional variant. Across diverse benchmarks in computer vision (CIFAR-10), 3D point clouds (ScanObjectNN), and molecular property prediction (QM9, OMol25), the Platonic Transformer achieves competitive performance by leveraging these geometric constraints at no additional cost.
Related papers
- Fighter: Unveiling the Graph Convolutional Nature of Transformers in Time Series Modeling [33.595964789473065]
This work demystifies the Transformer encoder by establishing its fundamental equivalence to a Graph Convolutional Network (GCN)<n>We propose textbfFighter (Flexible Graph Convolutional Transformer), a streamlined architecture that removes redundant linear projections and incorporates multi-hop graph aggregation.
arXiv Detail & Related papers (2025-10-20T02:42:14Z) - Geometrically Constrained and Token-Based Probabilistic Spatial Transformers [5.437226012505534]
We revisit Spatial Transformer Networks (STNs) as a canonicalization tool for transformer-based vision pipelines.<n>We propose a probabilistic, component-wise extension that improves robustness.<n> Experiments on challenging moth classification benchmarks demonstrate that our method consistently improves robustness compared to other STNs.
arXiv Detail & Related papers (2025-09-14T11:30:53Z) - Generalized Linear Mode Connectivity for Transformers [87.32299363530996]
A striking phenomenon is linear mode connectivity (LMC), where independently trained models can be connected by low- or zero-loss paths.<n>Prior work has predominantly focused on neuron re-ordering through permutations, but such approaches are limited in scope.<n>We introduce a unified framework that captures four symmetry classes: permutations, semi-permutations, transformations, and general invertible maps.<n>This generalization enables, for the first time, the discovery of low- and zero-barrier linear paths between independently trained Vision Transformers and GPT-2 models.
arXiv Detail & Related papers (2025-06-28T01:46:36Z) - OT-Transformer: A Continuous-time Transformer Architecture with Optimal Transport Regularization [1.7180235064112577]
We consider a dynamical system whose governing equation is parametrized by transformer blocks.<n>We leverage optimal transport theory to regularize the training problem, which enhances stability in training and improves generalization of the resulting model.
arXiv Detail & Related papers (2025-01-30T22:52:40Z) - Variable-size Symmetry-based Graph Fourier Transforms for image compression [65.7352685872625]
We propose a new family of Symmetry-based Graph Fourier Transforms of variable sizes into a coding framework.
Our proposed algorithm generates symmetric graphs on the grid by adding specific symmetrical connections between nodes.
Experiments show that SBGFTs outperform the primary transforms integrated in the explicit Multiple Transform Selection.
arXiv Detail & Related papers (2024-11-24T13:00:44Z) - What Does It Mean to Be a Transformer? Insights from a Theoretical Hessian Analysis [8.008567379796666]
We provide a fundamental understanding of what distinguishes the Transformer from the other architectures.<n>Our results suggest that various common architectural and optimization choices in Transformers can be traced back to their highly non-linear dependencies.
arXiv Detail & Related papers (2024-10-14T18:15:02Z) - Transolver: A Fast Transformer Solver for PDEs on General Geometries [66.82060415622871]
We present Transolver, which learns intrinsic physical states hidden behind discretized geometries.
By calculating attention to physics-aware tokens encoded from slices, Transovler can effectively capture intricate physical correlations.
Transolver achieves consistent state-of-the-art with 22% relative gain across six standard benchmarks and also excels in large-scale industrial simulations.
arXiv Detail & Related papers (2024-02-04T06:37:38Z) - Curve Your Attention: Mixed-Curvature Transformers for Graph
Representation Learning [77.1421343649344]
We propose a generalization of Transformers towards operating entirely on the product of constant curvature spaces.
We also provide a kernelized approach to non-Euclidean attention, which enables our model to run in time and memory cost linear to the number of nodes and edges.
arXiv Detail & Related papers (2023-09-08T02:44:37Z) - Learning Modulated Transformation in GANs [69.95217723100413]
We equip the generator in generative adversarial networks (GANs) with a plug-and-play module, termed as modulated transformation module (MTM)
MTM predicts spatial offsets under the control of latent codes, based on which the convolution operation can be applied at variable locations.
It is noteworthy that towards human generation on the challenging TaiChi dataset, we improve the FID of StyleGAN3 from 21.36 to 13.60, demonstrating the efficacy of learning modulated geometry transformation.
arXiv Detail & Related papers (2023-08-29T17:51:22Z) - B-cos Alignment for Inherently Interpretable CNNs and Vision
Transformers [97.75725574963197]
We present a new direction for increasing the interpretability of deep neural networks (DNNs) by promoting weight-input alignment during training.
We show that a sequence of such transformations induces a single linear transformation that faithfully summarises the full model computations.
We show that the resulting explanations are of high visual quality and perform well under quantitative interpretability metrics.
arXiv Detail & Related papers (2023-06-19T12:54:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.