Embryology of a Language Model
- URL: http://arxiv.org/abs/2508.00331v1
- Date: Fri, 01 Aug 2025 05:39:41 GMT
- Title: Embryology of a Language Model
- Authors: George Wang, Garrett Baker, Andrew Gordon, Daniel Murfet
- Abstract summary: In this work, we introduce an embryological approach, applying UMAP to the susceptibility matrix to visualize the model's structural development over training. Our visualizations reveal the emergence of a clear "body plan," charting the formation of known features like the induction circuit and discovering previously unknown structures.
- Score: 1.1874560263468232
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Understanding how language models develop their internal computational structure is a central problem in the science of deep learning. While susceptibilities, drawn from statistical physics, offer a promising analytical tool, their full potential for visualizing network organization remains untapped. In this work, we introduce an embryological approach, applying UMAP to the susceptibility matrix to visualize the model's structural development over training. Our visualizations reveal the emergence of a clear "body plan," charting the formation of known features like the induction circuit and discovering previously unknown structures, such as a "spacing fin" dedicated to counting space tokens. This work demonstrates that susceptibility analysis can move beyond validation to uncover novel mechanisms, providing a powerful, holistic lens for studying the developmental principles of complex neural networks.
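To make the visualization step described in the abstract concrete, the following is a minimal Python sketch (not the authors' code) of embedding a precomputed susceptibility matrix with UMAP. It assumes one row per model component (e.g. an attention head or weight block), uses the umap-learn package, and the file name, metric choice, and component granularity are illustrative assumptions rather than details from the paper.

```python
# Minimal sketch: visualize a model's "body plan" at one training checkpoint
# by embedding the rows of a precomputed susceptibility matrix with UMAP.
# The susceptibility matrix itself (how it is estimated, and over which
# components) is assumed to be computed elsewhere.
import numpy as np
import umap  # pip install umap-learn
import matplotlib.pyplot as plt

def embed_susceptibilities(chi: np.ndarray, seed: int = 0) -> np.ndarray:
    """Return a 2-D UMAP embedding of the rows of a susceptibility matrix chi."""
    reducer = umap.UMAP(n_components=2, metric="cosine", random_state=seed)
    return reducer.fit_transform(chi)

# Hypothetical usage: chi_t is the (n_components x n_components) susceptibility
# matrix at checkpoint t; repeating this over checkpoints traces development.
chi_t = np.load("susceptibility_checkpoint_t.npy")  # assumed file name
xy = embed_susceptibilities(chi_t)
plt.scatter(xy[:, 0], xy[:, 1], s=8)
plt.title("UMAP of susceptibility matrix (one checkpoint)")
plt.show()
```

Running the same embedding across a sequence of checkpoints and comparing the resulting maps is one plausible way to chart how structures such as the induction circuit take shape over training.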
Related papers
- Concept-Based Mechanistic Interpretability Using Structured Knowledge Graphs [3.429783703166407]
Our framework enables a global dissection of model behavior by analyzing how high-level semantic attributes emerge, interact, and propagate through internal model components. A key innovation is our visualization platform, named BAGEL, which presents these insights in a structured knowledge graph. Our framework is model-agnostic, scalable, and contributes to a deeper understanding of how deep learning models generalize (or fail to) in the presence of dataset biases.
arXiv Detail & Related papers (2025-07-08T09:30:20Z)
- Why Neural Network Can Discover Symbolic Structures with Gradient-based Training: An Algebraic and Geometric Foundation for Neurosymbolic Reasoning [73.18052192964349]
We develop a theoretical framework that explains how discrete symbolic structures can emerge naturally from continuous neural network training dynamics. By lifting neural parameters to a measure space and modeling training as Wasserstein gradient flow, we show that under geometric constraints, the parameter measure $\mu_t$ undergoes two concurrent phenomena.
arXiv Detail & Related papers (2025-06-26T22:40:30Z)
- Unveiling and Steering Connectome Organization with Interpretable Latent Variables [11.702970031377307]
We propose a framework that combines subgraph extraction from the Drosophila connectome, FlyWire, with a generative model to derive interpretable low-dimensional representations of neural circuitry. This research offers a novel tool for understanding brain architecture and a potential avenue for designing bio-inspired artificial neural networks.
arXiv Detail & Related papers (2025-05-19T11:54:40Z)
- Concept-Guided Interpretability via Neural Chunking [54.73787666584143]
We show that neural networks exhibit patterns in their raw population activity that mirror regularities in the training data. We propose three methods to extract these emerging entities, complementing each other based on label availability and dimensionality. Our work points to a new direction for interpretability, one that harnesses both cognitive principles and the structure of naturalistic data.
arXiv Detail & Related papers (2025-05-16T13:49:43Z)
- Meta-Representational Predictive Coding: Biomimetic Self-Supervised Learning [51.22185316175418]
We present a new form of predictive coding that we call meta-representational predictive coding (MPC). MPC sidesteps the need for learning a generative model of sensory input by learning to predict representations of sensory input across parallel streams.
arXiv Detail & Related papers (2025-03-22T22:13:14Z)
- Deep Learning Through A Telescoping Lens: A Simple Model Provides Empirical Insights On Grokking, Gradient Boosting & Beyond [61.18736646013446]
In pursuit of a deeper understanding of its surprising behaviors, we investigate the utility of a simple yet accurate model of a trained neural network.
Across three case studies, we illustrate how it can be applied to derive new empirical insights on a diverse range of prominent phenomena.
arXiv Detail & Related papers (2024-10-31T22:54:34Z)
- Spatial embedding promotes a specific form of modularity with low entropy and heterogeneous spectral dynamics [0.0]
Spatially embedded recurrent neural networks provide a promising avenue to study how modelled constraints shape the combined structural and functional organisation of networks over learning.
We show that it is possible to study these restrictions through entropic measures of the neural weights and eigenspectrum, across both rate and spiking neural networks.
This work deepens our understanding of constrained learning in neural networks, across coding schemes and tasks, where solutions to simultaneous structural and functional objectives must be accomplished in tandem.
arXiv Detail & Related papers (2024-09-26T10:00:05Z)
- Unraveling Feature Extraction Mechanisms in Neural Networks [10.13842157577026]
We propose a theoretical approach based on Neural Tangent Kernels (NTKs) to investigate such mechanisms.
We reveal how these models leverage statistical features during gradient descent and how they are integrated into final decisions.
We find that while self-attention and CNN models may exhibit limitations in learning n-grams, multiplication-based models seem to excel in this area.
arXiv Detail & Related papers (2023-10-25T04:22:40Z)
- Graph Foundation Models: Concepts, Opportunities and Challenges [66.37994863159861]
Foundation models have emerged as critical components in a variety of artificial intelligence applications. The capabilities of foundation models in generalization and adaptation motivate graph machine learning researchers to discuss the potential of developing a new graph learning paradigm. This article introduces the concept of Graph Foundation Models (GFMs), and offers an exhaustive explanation of their key characteristics and underlying technologies.
arXiv Detail & Related papers (2023-10-18T09:31:21Z)
- A Novel Neural-symbolic System under Statistical Relational Learning [47.30190559449236]
We propose a neural-symbolic framework based on statistical relational learning, referred to as NSF-SRL. Results of symbolic reasoning are utilized to refine and correct the predictions made by deep learning models, while deep learning models enhance the efficiency of the symbolic reasoning process. We believe that this approach sets a new standard for neural-symbolic systems and will drive future research in the field of general artificial intelligence.
arXiv Detail & Related papers (2023-09-16T09:15:37Z)
- Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.