Intrinsic Task Symmetry Drives Generalization in Algorithmic Tasks
- URL: http://arxiv.org/abs/2603.01968v1
- Date: Mon, 02 Mar 2026 15:19:24 GMT
- Title: Intrinsic Task Symmetry Drives Generalization in Algorithmic Tasks
- Authors: Hyeonbin Hwang, Yeachan Park,
- Abstract summary: We identify a consistent three-stage training dynamic underlying grokking. We show that generalization emerges during the symmetry acquisition phase. We introduce a symmetry-based diagnostic that anticipates the onset of generalization and propose strategies to accelerate it.
- Score: 4.075225553131796
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Grokking, the sudden transition from memorization to generalization, is characterized by the emergence of low-dimensional representations, yet the mechanism underlying this organization remains elusive. We propose that intrinsic task symmetries primarily drive grokking and shape the geometry of the model's representation space. We identify a consistent three-stage training dynamic underlying grokking: (i) memorization, (ii) symmetry acquisition, and (iii) geometric organization. We show that generalization emerges during the symmetry acquisition phase, after which representations reorganize into a structured, task-aligned geometry. We validate this symmetry-driven account across diverse algorithmic domains, including algebraic, structural, and relational reasoning tasks. Building on these findings, we introduce a symmetry-based diagnostic that anticipates the onset of generalization and propose strategies to accelerate it. Together, our results establish intrinsic symmetry as the key factor enabling neural networks to move beyond memorization and achieve robust algorithmic reasoning.
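The abstract's "symmetry-based diagnostic" can be illustrated with a toy sketch (this is a hypothetical illustration under stated assumptions, not the authors' actual method): for a commutative task such as modular addition, one intrinsic task symmetry is f(a, b) == f(b, a), so the fraction of input pairs on which a model's predictions respect that swap serves as a simple proxy for how much of the symmetry it has acquired.

```python
import itertools
import random

def symmetry_score(predict, n, num_pairs=None):
    """Fraction of input pairs (a, b) with a != b for which
    predict(a, b) == predict(b, a) -- a toy commutativity diagnostic."""
    pairs = list(itertools.combinations(range(n), 2))
    if num_pairs is not None:
        pairs = pairs[:num_pairs]
    agree = sum(predict(a, b) == predict(b, a) for a, b in pairs)
    return agree / len(pairs)

# A model that has truly learned (a + b) mod n is trivially commutative:
perfect = lambda a, b: (a + b) % 7
print(symmetry_score(perfect, 7))  # 1.0

# A pure memorizer (random lookup table) has no reason to respect the swap:
rng = random.Random(0)
table = {(a, b): rng.randrange(7) for a in range(7) for b in range(7)}
memorizer = lambda a, b: table[(a, b)]
print(symmetry_score(memorizer, 7))  # low score, roughly 1/7 on average
```

Tracked over training, a score like this rising toward 1.0 before test accuracy moves would correspond to the paper's claim that symmetry acquisition precedes generalization.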
Related papers
- Developmental Symmetry-Loss: A Free-Energy Perspective on Brain-Inspired Invariance Learning [0.0]
We propose Symmetry-Loss, a brain-inspired algorithmic principle. We show how Symmetry-Loss operationalizes a Free-Energy-like objective for representation learning. The result is a general computational mechanism linking developmental learning in the brain with principled representation learning in artificial systems.
arXiv Detail & Related papers (2025-12-04T22:12:15Z) - Twirlator: A Pipeline for Analyzing Subgroup Symmetry Effects in Quantum Machine Learning Ansatzes [3.54873963145126]
Symmetries have been a key driver of performance gains in geometric deep learning and in geometric and equivariant quantum machine learning. While symmetrization appears to be a promising method, its practical overhead, such as additional gates, reduced expressibility, and other factors, is not well understood in quantum machine learning. We develop an automated pipeline to measure various characteristics of quantum machine learning ansatzes with respect to symmetries that can appear in the learning task.
arXiv Detail & Related papers (2025-11-06T10:29:24Z) - Axis-level Symmetry Detection with Group-Equivariant Representation [48.813587457507786]
Recent heatmap-based approaches can localize potential regions of symmetry axes but often lack precision in identifying individual axes. We propose a novel framework for axis-level detection of the two most common symmetry types, reflection and rotation. Our method achieves state-of-the-art performance, outperforming existing approaches.
arXiv Detail & Related papers (2025-08-14T15:26:53Z) - On two-dimensional tensor network group symmetries [0.0]
We introduce two-dimensional tensor network representations of finite groups carrying a 4-cocycle index. We characterize the associated gapped (2+1)D phases that emerge when these anomalous symmetries act on tensor network ground states.
arXiv Detail & Related papers (2025-07-22T11:28:59Z) - Generalized Linear Mode Connectivity for Transformers [87.32299363530996]
A striking phenomenon is linear mode connectivity (LMC), where independently trained models can be connected by low- or zero-loss paths. Prior work has predominantly focused on neuron re-ordering through permutations, but such approaches are limited in scope. We introduce a unified framework that captures four symmetry classes: permutations, semi-permutations, transformations, and general invertible maps. This generalization enables, for the first time, the discovery of low- and zero-barrier linear paths between independently trained Vision Transformers and GPT-2 models.
arXiv Detail & Related papers (2025-06-28T01:46:36Z) - Why Neural Network Can Discover Symbolic Structures with Gradient-based Training: An Algebraic and Geometric Foundation for Neurosymbolic Reasoning [73.18052192964349]
We develop a theoretical framework that explains how discrete symbolic structures can emerge naturally from continuous neural network training dynamics. By lifting neural parameters to a measure space and modeling training as Wasserstein gradient flow, we show that under geometric constraints, the parameter measure $\mu_t$ undergoes two concurrent phenomena.
arXiv Detail & Related papers (2025-06-26T22:40:30Z) - Relative Representations: Topological and Geometric Perspectives [50.85040046976025]
Relative representations are an established approach to zero-shot model stitching. First, we introduce a normalization procedure in the relative transformation, resulting in invariance to non-isotropic rescalings and permutations. Second, we propose to deploy topological densification when fine-tuning relative representations, a topological regularization loss encouraging clustering within classes.
arXiv Detail & Related papers (2024-09-17T08:09:22Z) - Current Symmetry Group Equivariant Convolution Frameworks for Representation Learning [5.802794302956837]
Euclidean deep learning is often inadequate for addressing real-world signals where the representation space is irregular and curved with complex topologies.
We focus on the importance of symmetry group equivariant deep learning models and their realization of convolution-like operations on graphs, 3D shapes, and non-Euclidean spaces.
arXiv Detail & Related papers (2024-09-11T15:07:18Z) - Identifying the Group-Theoretic Structure of Machine-Learned Symmetries [41.56233403862961]
We propose methods for examining and identifying the group-theoretic structure of such machine-learned symmetries.
As an application to particle physics, we demonstrate the identification of the residual symmetries after the spontaneous breaking of non-Abelian gauge symmetries.
arXiv Detail & Related papers (2023-09-14T17:03:50Z) - A Unified Framework for Discovering Discrete Symmetries [17.687122467264487]
We consider the problem of learning a function respecting a symmetry from among a class of symmetries.
We develop a unified framework that enables symmetry discovery across a broad range of subgroups.
arXiv Detail & Related papers (2023-09-06T10:41:30Z) - Oracle-Preserving Latent Flows [58.720142291102135]
We develop a methodology for the simultaneous discovery of multiple nontrivial continuous symmetries across an entire labelled dataset.
The symmetry transformations and the corresponding generators are modeled with fully connected neural networks trained with a specially constructed loss function.
The two new elements in this work are the use of a reduced-dimensionality latent space and the generalization to transformations invariant with respect to high-dimensional oracles.
arXiv Detail & Related papers (2023-02-02T00:13:32Z) - Deep Learning Symmetries and Their Lie Groups, Algebras, and Subalgebras from First Principles [55.41644538483948]
We design a deep-learning algorithm for the discovery and identification of the continuous group of symmetries present in a labeled dataset.
We use fully connected neural networks to model the symmetry transformations and the corresponding generators.
Our study also opens the door for using a machine learning approach in the mathematical study of Lie groups and their properties.
arXiv Detail & Related papers (2023-01-13T16:25:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.