Toward Manifest Relationality in Transformers via Symmetry Reduction
- URL: http://arxiv.org/abs/2602.18948v1
- Date: Sat, 21 Feb 2026 19:43:17 GMT
- Title: Toward Manifest Relationality in Transformers via Symmetry Reduction
- Authors: J. François, L. Ravera
- Abstract summary: Transformer models contain substantial internal redundancy. Recent approaches address this by explicitly breaking symmetry. We propose a complementary framework based on symmetry reduction.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Transformer models contain substantial internal redundancy arising from coordinate-dependent representations and continuous symmetries, in model space and in head space, respectively. While recent approaches address this by explicitly breaking symmetry, we propose a complementary framework based on symmetry reduction. We reformulate representations, attention mechanisms, and optimization dynamics in terms of invariant relational quantities, eliminating redundant degrees of freedom by construction. This perspective yields architectures that operate directly on relational structures, providing a principled geometric framework for reducing parameter redundancy and analyzing optimization.
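To make "invariant relational quantities" concrete, here is a minimal numpy sketch (ours, not the authors' code): the Gram matrix of token representations depends only on pairwise inner products, so it is unchanged under any orthogonal change of basis in model space; this is exactly the kind of coordinate redundancy the paper proposes to eliminate by construction.

```python
import numpy as np

rng = np.random.default_rng(0)
T, d = 5, 8                       # tokens, model dimension
X = rng.normal(size=(T, d))       # token representations in some basis

G = X @ X.T                       # relational quantity: pairwise inner products

# Arbitrary orthogonal change of basis in model space.
Q, _ = np.linalg.qr(rng.normal(size=(d, d)))

# The Gram matrix is invariant: it encodes only relations between tokens.
assert np.allclose(G, (X @ Q) @ (X @ Q).T)
```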
Related papers
- Scale redundancy and soft gauge fixing in positively homogeneous neural networks [0.0]
Neural networks with positively homogeneous activations exhibit an exact continuous reparametrization symmetry. We introduce gauge-adapted coordinates that separate invariant and scale-imbalance directions. Inspired by gauge fixing in field theory, we introduce a soft orbit-selection functional acting only on redundant scale coordinates.
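The symmetry in question is easy to exhibit. A minimal sketch (our illustration): for a ReLU layer, scaling a hidden unit's incoming weights by a > 0 and its outgoing weights by 1/a leaves the network function exactly unchanged, since relu(a*z) = a*relu(z); gauge fixing amounts to selecting one representative per such orbit.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

rng = np.random.default_rng(1)
W1 = rng.normal(size=(16, 4))     # input -> hidden
W2 = rng.normal(size=(3, 16))     # hidden -> output
x = rng.normal(size=4)

net = lambda A, B: B @ relu(A @ x)

a, j = 2.7, 5                     # positive scale, hidden unit index
W1s, W2s = W1.copy(), W2.copy()
W1s[j, :] *= a                    # relu(a*z) = a*relu(z) for a > 0,
W2s[:, j] /= a                    # so the function is exactly preserved

assert np.allclose(net(W1, W2), net(W1s, W2s))
```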
arXiv Detail & Related papers (2026-02-16T13:21:49Z) - Deep Delta Learning [91.75868893250662]
We introduce Deep Delta Learning (DDL), a novel architecture that generalizes the standard residual connection. We provide a spectral analysis of this operator, demonstrating that the gate, a function of the input $\mathbf{X}$, enables dynamic interpolation between identity mapping, projection, and geometric reflection. This unification empowers the network to explicitly control the spectrum of its layer-wise transition operator, enabling the modeling of complex, non-monotonic dynamics.
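As a hedged illustration of a gate that moves a layer between identity, projection, and reflection (our sketch, not necessarily DDL's exact operator), consider a Householder-type transition y = (I - beta*v*v^T) x with unit v: its spectrum is 1 everywhere except 1 - beta along v, so beta = 0, 1, 2 yield identity, projection, and reflection.

```python
import numpy as np

rng = np.random.default_rng(2)
d = 6
v = rng.normal(size=d)
v /= np.linalg.norm(v)            # unit direction defining the gate

def gated_update(x, v, beta):
    # y = (I - beta * v v^T) x: eigenvalue 1 - beta along v, 1 elsewhere
    return x - beta * v * (v @ x)

x = rng.normal(size=d)
assert np.allclose(gated_update(x, v, 0.0), x)            # identity
assert np.isclose(v @ gated_update(x, v, 1.0), 0.0)       # projection
y = gated_update(x, v, 2.0)                               # reflection
assert np.isclose(v @ y, -(v @ x))                        # flipped along v
assert np.isclose(np.linalg.norm(y), np.linalg.norm(x))   # isometry
```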
arXiv Detail & Related papers (2026-01-01T18:11:38Z) - From Coefficients to Directions: Rethinking Model Merging with Directional Alignment [66.99062575537555]
We introduce a unified geometric framework, Merging with Directional Alignment, which aligns directional structures consistently in both the parameter and feature spaces. Our analysis shows that directional alignment improves structural coherence, and extensive experiments across benchmarks, model scales, and task configurations further validate the effectiveness of our approach.
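A hypothetical sketch of the direction/magnitude decoupling idea (our toy, not the paper's algorithm): merge two weight matrices by averaging per-column directions and norms separately, instead of averaging raw coefficients.

```python
import numpy as np

def merge_directional(Wa, Wb):
    """Hypothetical toy merge: average per-column directions and norms
    separately, rather than averaging raw coefficients."""
    na = np.linalg.norm(Wa, axis=0, keepdims=True)
    nb = np.linalg.norm(Wb, axis=0, keepdims=True)
    direction = Wa / na + Wb / nb
    direction /= np.linalg.norm(direction, axis=0, keepdims=True)
    return 0.5 * (na + nb) * direction

rng = np.random.default_rng(3)
Wa, Wb = rng.normal(size=(8, 4)), rng.normal(size=(8, 4))
Wm = merge_directional(Wa, Wb)
# Merged columns keep the averaged norm but an aligned direction,
# unlike naive coefficient averaging 0.5 * (Wa + Wb).
print(np.linalg.norm(Wm, axis=0))
```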
arXiv Detail & Related papers (2025-11-29T08:40:58Z) - RiemanLine: Riemannian Manifold Representation of 3D Lines for Factor Graph Optimization [49.83974390433746]
This paper introduces RiemanLine, a unified minimal representation for 3D lines. Our key idea is to decouple each line landmark into global and local components. Experiments on ICL-NUIM, TartanAir, and synthetic benchmarks demonstrate that our method achieves significantly more accurate pose estimation and line reconstruction.
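The summary does not give the parametrization; one standard way to realize a global/local split for 3D lines (our sketch, not necessarily the paper's construction) is normalized Pluecker coordinates: a unit direction d on the sphere plus a moment m with d . m = 0.

```python
import numpy as np

def pluecker_from_points(p, q):
    """Normalized Pluecker coordinates of the line through p and q:
    unit direction d (the 'global' part, a point on S^2) and moment
    m = p x d orthogonal to d (the 'local' offset part)."""
    d = (q - p) / np.linalg.norm(q - p)
    return d, np.cross(p, d)

p = np.array([1.0, 0.0, 0.0])
q = np.array([1.0, 1.0, 1.0])
d, m = pluecker_from_points(p, q)
assert np.isclose(d @ m, 0.0)                 # Pluecker constraint

# Any two points on the same (oriented) line give the same (d, m).
r1, r2 = p + 0.5 * (q - p), p + 3.0 * (q - p)
d2, m2 = pluecker_from_points(r1, r2)
assert np.allclose(d, d2) and np.allclose(m, m2)
```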
arXiv Detail & Related papers (2025-08-06T11:27:38Z) - Generalized Linear Mode Connectivity for Transformers [87.32299363530996]
A striking phenomenon is linear mode connectivity (LMC), where independently trained models can be connected by low- or zero-loss paths. Prior work has predominantly focused on neuron re-ordering through permutations, but such approaches are limited in scope. We introduce a unified framework that captures four symmetry classes: permutations, semi-permutations, orthogonal transformations, and general invertible maps. This generalization enables, for the first time, the discovery of low- and zero-barrier linear paths between independently trained Vision Transformers and GPT-2 models.
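The basic LMC diagnostic is simple to state. A minimal sketch (ours): measure the worst-case loss increase along the straight line between two parameter vectors; alignment by one of the symmetry classes above is applied to an endpoint before interpolating. The toy loss below has an exact sign-flip symmetry playing the role of a permutation.

```python
import numpy as np

def loss_barrier(loss, theta_a, theta_b, n=21):
    """Max loss increase along the linear path, relative to the
    endpoints' average: a standard LMC diagnostic."""
    path = [loss((1 - t) * theta_a + t * theta_b)
            for t in np.linspace(0.0, 1.0, n)]
    return max(path) - 0.5 * (path[0] + path[-1])

# Toy loss with an exact sign-flip symmetry per coordinate.
loss = lambda th: float(np.sum((th**2 - 1.0) ** 2))
theta_a, theta_b = np.array([1.0, 1.0]), np.array([-1.0, 1.0])

print(loss_barrier(loss, theta_a, theta_b))       # 1.0: a barrier
aligned_b = theta_b * np.array([-1.0, 1.0])       # exploit the symmetry
print(loss_barrier(loss, theta_a, aligned_b))     # 0.0: barrier removed
```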
arXiv Detail & Related papers (2025-06-28T01:46:36Z) - Symmetry in Neural Network Parameter Spaces [36.618818500498676]
A significant portion of redundancy is explained by symmetries in the parameter space: transformations that leave the network function unchanged. These symmetries shape the loss landscape and constrain learning dynamics, offering a new lens for understanding optimization, generalization, and model complexity. We summarize existing literature, uncover connections between symmetry and learning theory, and identify gaps and opportunities in this emerging field.
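The canonical example of such a symmetry is neuron permutation; a minimal numerical check (ours):

```python
import numpy as np

rng = np.random.default_rng(4)
W1, b1 = rng.normal(size=(10, 3)), rng.normal(size=10)
W2 = rng.normal(size=(2, 10))
x = rng.normal(size=3)

f = lambda A, b, B: B @ np.tanh(A @ x + b)

# Permute hidden units: same permutation on W1's rows, b1, W2's columns.
perm = rng.permutation(10)
assert np.allclose(f(W1, b1, W2), f(W1[perm], b1[perm], W2[:, perm]))
```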
arXiv Detail & Related papers (2025-06-16T00:59:12Z) - Sharper Convergence Rates for Nonconvex Optimisation via Reduction Mappings [38.819359908152656]
We show that well-designed reduction mappings improve curvature properties of the objective, leading to better-conditioned problems and theoretically faster convergence for gradient-based methods. Our analysis unifies a range of scenarios where structural information at optimality is leveraged to accelerate convergence, offering a principled explanation for the empirical gains observed in such optimisation algorithms.
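A toy example of the mechanism (our illustration, not the paper's setting): overparametrizing a scalar target as w = u*v introduces an exact rescaling symmetry (u, v) -> (a*u, v/a), and at an imbalanced optimum the Hessian has one zero eigenvalue and one eigenvalue growing with the imbalance; reducing to the invariant w restores constant curvature.

```python
import numpy as np

# Overparametrized objective F(u, v) = (u*v - 1)^2 with the exact
# rescaling symmetry (u, v) -> (a*u, v/a).
def hessian_F(u, v):
    # Analytic Hessian of F at (u, v).
    return np.array([[2 * v**2,        4 * u * v - 2.0],
                     [4 * u * v - 2.0, 2 * u**2       ]])

# At an imbalanced optimum (u*v = 1, u >> v) the Hessian has one zero
# eigenvalue (the flat symmetry direction) and one huge eigenvalue.
print(np.linalg.eigvalsh(hessian_F(10.0, 0.1)))   # ~[0, 200]

# Reduction mapping: optimize over the invariant w = u*v instead.
# f(w) = (w - 1)^2 has constant curvature 2: perfectly conditioned.
```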
arXiv Detail & Related papers (2025-06-10T04:03:59Z) - Relative Representations: Topological and Geometric Perspectives [50.85040046976025]
Relative representations are an established approach to zero-shot model stitching. First, we introduce a normalization procedure in the relative transformation, resulting in invariance to non-isotropic rescalings and permutations. Second, we propose to deploy topological densification when fine-tuning relative representations: a topological regularization loss encouraging clustering within classes.
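The base construction (from prior work on relative representations) encodes each sample by its cosine similarities to a set of anchors, which is already invariant to orthogonal maps and isotropic rescaling; the normalization proposed here extends this to non-isotropic rescalings and permutations. A minimal sketch of the base construction:

```python
import numpy as np

def relative_rep(Z, anchors):
    """Cosine similarities of each embedding to a set of anchors."""
    Zn = Z / np.linalg.norm(Z, axis=1, keepdims=True)
    An = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    return Zn @ An.T

rng = np.random.default_rng(5)
Z = rng.normal(size=(20, 16))     # embeddings from 'model A'
anchors = Z[:4]                   # a few anchor samples

# 'Model B': same latent space up to rotation and isotropic rescaling.
Q, _ = np.linalg.qr(rng.normal(size=(16, 16)))
Z_b, anchors_b = 3.0 * (Z @ Q), 3.0 * (anchors @ Q)

assert np.allclose(relative_rep(Z, anchors), relative_rep(Z_b, anchors_b))
```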
arXiv Detail & Related papers (2024-09-17T08:09:22Z) - Counting Phases and Faces Using Bayesian Thermodynamic Integration [77.34726150561087]
We introduce a new approach to reconstructing thermodynamic functions and phase boundaries in two-parametric statistical mechanics systems.
We use the proposed approach to accurately reconstruct the partition functions and phase diagrams of the Ising model and the exactly solvable non-equilibrium TASEP.
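For orientation, plain (non-Bayesian) thermodynamic integration rests on the identity d(log Z)/d(beta) = -<H>_beta; here is a minimal sketch on a tiny Ising chain where Z is exactly enumerable (our toy, not the paper's Bayesian procedure):

```python
import numpy as np
from itertools import product

# Tiny open Ising chain; Z is exactly enumerable (2^N states).
N, J = 6, 1.0
S = np.array(list(product([-1, 1], repeat=N)))
H = -J * np.sum(S[:, :-1] * S[:, 1:], axis=1)

def mean_energy(beta):
    w = np.exp(-beta * H - np.max(-beta * H))   # numerically stable weights
    return np.sum(w * H) / np.sum(w)

# log Z(beta) = log Z(0) - integral_0^beta <H>_b db  (trapezoid rule).
beta = 1.0
bs = np.linspace(0.0, beta, 201)
e = np.array([mean_energy(b) for b in bs])
logZ_ti = np.log(len(S)) - np.sum(0.5 * (e[1:] + e[:-1]) * np.diff(bs))

logZ_exact = np.log(np.sum(np.exp(-beta * H)))
print(logZ_ti, logZ_exact)                      # should agree closely
```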
arXiv Detail & Related papers (2022-05-18T17:11:23Z)