Training-Driven Representational Geometry Modularization Predicts Brain Alignment in Language Models
- URL: http://arxiv.org/abs/2602.07539v1
- Date: Sat, 07 Feb 2026 13:26:21 GMT
- Title: Training-Driven Representational Geometry Modularization Predicts Brain Alignment in Language Models
- Authors: Yixuan Liu, Zhiyuan Ma, Likai Tang, Runmin Gan, Xinche Zhang, Jinhao Li, Chao Xie, Sen Song
- Abstract summary: How large language models (LLMs) align with the neural representation and computation of human language is a central question in cognitive science. We identified a geometric modularization where layers self-organize into stable low- and high-complexity clusters. The low-complexity module, characterized by reduced entropy and curvature, consistently better predicted human language network activity.
- Score: 10.7573063848449
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: How large language models (LLMs) align with the neural representation and computation of human language is a central question in cognitive science. Using representational geometry as a mechanistic lens, we addressed this question by tracking entropy, curvature, and fMRI encoding scores throughout Pythia (70M-1B) training. We identified a geometric modularization in which layers self-organize into stable low- and high-complexity clusters. The low-complexity module, characterized by reduced entropy and curvature, consistently better predicted human language network activity. This alignment followed heterogeneous spatiotemporal trajectories: rapid and stable in temporal regions (AntTemp, PostTemp), but delayed and dynamic in frontal areas (IFG, IFGorb). Crucially, reduced curvature remained a robust predictor of model-brain alignment even after controlling for training progress, an effect that strengthened with model scale. These results link training-driven geometric reorganization to temporal-frontal functional specialization, suggesting that representational smoothing facilitates neural-like linguistic processing.
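The three per-layer quantities the abstract tracks can be approximated with standard tools. Below is a minimal sketch, not the authors' released implementation: entropy is taken as the entropy of the activation eigenspectrum, curvature as the average turning angle between successive token displacements, and the encoding score as cross-validated ridge regression from layer activations to voxel responses. The inputs (a layer's activation matrix and an aligned fMRI response matrix) and all function names here are illustrative assumptions.

# Illustrative sketch of per-layer geometry and fMRI-encoding metrics.
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import cross_val_predict

def spectral_entropy(X):
    """Entropy of the normalized eigenspectrum of centered activations X
    (n_tokens x d_model). Lower values mean variance concentrates in fewer
    directions, i.e. a lower-complexity geometry."""
    X = X - X.mean(axis=0)
    eig = np.linalg.svd(X, compute_uv=False) ** 2
    p = eig / eig.sum()
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

def mean_curvature(X):
    """Average turning angle (radians) between successive token-to-token
    displacement vectors; smaller angles = smoother trajectories."""
    v = np.diff(X, axis=0)
    v = v / np.linalg.norm(v, axis=1, keepdims=True)
    cos = np.clip((v[:-1] * v[1:]).sum(axis=1), -1.0, 1.0)
    return float(np.arccos(cos).mean())

def encoding_score(X, bold):
    """Cross-validated ridge encoding score: predict fMRI responses `bold`
    (n_tokens x n_voxels) from activations X; returns mean Pearson r
    across voxels."""
    model = RidgeCV(alphas=np.logspace(-2, 4, 13))
    pred = cross_val_predict(model, X, bold, cv=5)
    rs = [np.corrcoef(pred[:, i], bold[:, i])[0, 1] for i in range(bold.shape[1])]
    return float(np.mean(rs))

Under this reading, the "geometric modularization" could be recovered by clustering the per-layer (entropy, curvature) pairs into two groups (e.g. sklearn.cluster.KMeans(n_clusters=2)) and checking which cluster yields higher encoding scores, though the paper's exact procedure may differ.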
Related papers
- Dynamical Systems Analysis Reveals Functional Regimes in Large Language Models [0.8694591156258423]
Large language models perform text generation through high-dimensional internal dynamics. Most interpretability approaches emphasise static representations or causal interventions, leaving temporal structure largely unexplored. We discuss a composite dynamical metric, computed from activation time-series during autoregressive generation.
arXiv Detail & Related papers (2026-01-11T21:57:52Z) - DiffeoMorph: Learning to Morph 3D Shapes Using Differentiable Agent-Based Simulations [2.174820084855635]
DiffeoMorph is an end-to-end differentiable framework for learning a morphogenesis protocol. It guides a population of agents to morph into a target 3D shape. We show that DiffeoMorph can form a range of shapes using only minimal cues.
arXiv Detail & Related papers (2025-12-18T23:50:42Z) - Neuronal Group Communication for Efficient Neural representation [85.36421257648294]
This paper addresses the question of how to build large neural systems that learn efficient, modular, and interpretable representations. We propose Neuronal Group Communication (NGC), a theory-driven framework that reimagines a neural network as a dynamical system of interacting neuronal groups. NGC treats weights as transient interactions between embedding-like neuronal states, with neural computation unfolding through iterative communication among groups of neurons.
arXiv Detail & Related papers (2025-10-19T14:23:35Z) - Sparse Autoencoder Neural Operators: Model Recovery in Function Spaces [75.45093712182624]
We introduce a framework that extends sparse autoencoders (SAEs) to lifted spaces and infinite-dimensional function spaces, enabling mechanistic interpretability of large neural operators (NOs). We compare the inference and training dynamics of SAEs, lifted-SAEs, and SAE neural operators. We highlight how lifting and operator modules introduce beneficial inductive biases, enabling faster recovery, improved recovery of smooth concepts, and robust inference across varying resolutions, a property unique to neural operators.
arXiv Detail & Related papers (2025-09-03T21:57:03Z) - Beyond Ensembles: Simulating All-Atom Protein Dynamics in a Learned Latent Space [4.5211402678313135]
We introduce the Graph Latent Dynamics Propagator (GLDP), a modular component for simulating dynamics within the learned latent space of LD-FPG. We compare three classes of propagators: (i) score-guided Langevin dynamics, (ii) Koopman-based linear operators, and (iii) autoregressive neural networks. Within a unified encoder-propagator-decoder framework, we evaluate long-horizon stability, backbone and side-chain ensemble fidelity, and functional free-energy landscapes.
arXiv Detail & Related papers (2025-09-02T11:09:06Z) - Equivariant U-Shaped Neural Operators for the Cahn-Hilliard Phase-Field Model [4.79907962230318]
We show that an equivariant U-shaped neural operator (E-UNO) can learn the evolution of the phase-field variable from short histories of past dynamics. By encoding symmetry and scale hierarchy, the model generalizes better, requires less training data, and yields physically consistent dynamics.
arXiv Detail & Related papers (2025-09-01T09:25:31Z) - Scaling Collapse Reveals Universal Dynamics in Compute-Optimally Trained Neural Networks [59.552873049024775]
We show that compute-optimally trained models exhibit a remarkably precise universality. With learning rate decay, the collapse becomes so tight that differences in the normalized curves across models fall below the noise floor. We explain these phenomena by connecting collapse to the power-law structure in typical neural scaling laws.
arXiv Detail & Related papers (2025-07-02T20:03:34Z) - Geometric sparsification in recurrent neural networks [0.8851237804522972]
A common technique for ameliorating the computational costs of running large neural models is sparsification. We propose a new technique for sparsification of recurrent neural nets (RNNs) called moduli regularization. We show that moduli regularization induces more stable recurrent neural nets and achieves high-fidelity models above 90% sparsity.
arXiv Detail & Related papers (2024-06-10T14:12:33Z) - Equivariant Graph Neural Operator for Modeling 3D Dynamics [148.98826858078556]
We propose the Equivariant Graph Neural Operator (EGNO) to directly model dynamics as trajectories instead of just next-step prediction.
EGNO explicitly learns the temporal evolution of 3D dynamics where we formulate the dynamics as a function over time and learn neural operators to approximate it.
Comprehensive experiments in multiple domains, including particle simulations, human motion capture, and molecular dynamics, demonstrate the significantly superior performance of EGNO against existing methods.
arXiv Detail & Related papers (2024-01-19T21:50:32Z) - Interpretable statistical representations of neural population dynamics and geometry [4.459704414303749]
We introduce a representation learning method, MARBLE, that decomposes on-manifold dynamics into local flow fields and maps them into a common latent space.
In simulated non-linear dynamical systems, recurrent neural networks, and experimental single-neuron recordings from primates and rodents, we discover emergent low-dimensional latent representations.
These representations are consistent across neural networks and animals, enabling the robust comparison of cognitive computations.
arXiv Detail & Related papers (2023-04-06T21:11:04Z) - Contrastive-Signal-Dependent Plasticity: Self-Supervised Learning in Spiking Neural Circuits [61.94533459151743]
This work addresses the challenge of designing neurobiologically-motivated schemes for adjusting the synapses of spiking networks.
Our experimental simulations demonstrate a consistent advantage over other biologically-plausible approaches when training recurrent spiking networks.
arXiv Detail & Related papers (2023-03-30T02:40:28Z) - NeuroMorph: Unsupervised Shape Interpolation and Correspondence in One Go [109.88509362837475]
We present NeuroMorph, a new neural network architecture that takes as input two 3D shapes.
NeuroMorph produces smooth and point-to-point correspondences between them.
It works well for a large variety of input shapes, including non-isometric pairs from different object categories.
arXiv Detail & Related papers (2021-06-17T12:25:44Z) - GeoMol: Torsional Geometric Generation of Molecular 3D Conformer Ensembles [60.12186997181117]
Prediction of a molecule's 3D conformer ensemble from the molecular graph holds a key role in areas of cheminformatics and drug discovery.
Existing generative models have several drawbacks including lack of modeling important molecular geometry elements.
We propose GeoMol, an end-to-end, non-autoregressive and SE(3)-invariant machine learning approach to generate 3D conformers.
arXiv Detail & Related papers (2021-06-08T14:17:59Z)