Related papers: Probing Neural Topology of Large Language Models

Probing Neural Topology of Large Language Models

URL: http://arxiv.org/abs/2506.01042v1
Date: Sun, 01 Jun 2025 14:57:03 GMT
Title: Probing Neural Topology of Large Language Models
Authors: Yu Zheng, Yuan Yuan, Yong Li, Paolo Santi,
Abstract summary: We introduce graph probing, a method for uncovering the functional connectivity topology of LLM neurons.<n>We find a universal predictability of next-token prediction performance using only neural topology.<n>This predictability is robust even when retaining just 1% of neuron connections or probing models after only 8 pretraining steps.
Score: 15.34202977968525
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Probing large language models (LLMs) has yielded valuable insights into their internal mechanisms by linking neural representations to interpretable semantics. However, how neurons functionally co-activate with each other to give rise to emergent capabilities remains largely unknown, hindering a deeper understanding and safer development of LLMs. In this work, we introduce graph probing, a method for uncovering the functional connectivity topology of LLM neurons and relating it to language generation performance. By analyzing internal neural graphs across diverse LLM families and scales, we discover a universal predictability of next-token prediction performance using only neural topology. This predictability is robust even when retaining just 1% of neuron connections or probing models after only 8 pretraining steps, highlighting the sparsity and early emergence of topological patterns. Further graph matching analysis suggests that, despite significant distinctions in architectures, parameters, and training data, different LLMs develop intricate and consistent neural topological structures that may form the foundation for their language generation abilities. Codes and data for the graph probing toolbox are released at https://github.com/DavyMorgan/llm-graph-probing.

Related papers

Interpretable Language Modeling via Induction-head Ngram Models [74.26720927767398]
We propose Induction-head ngram models (Induction-Gram) to bolster modern ngram models with a hand-engineered "induction head" This induction head uses a custom neural similarity metric to efficiently search the model's input context for potential next-word completions. Experiments show that this simple method significantly improves next-word prediction over baseline interpretable models.
arXiv Detail & Related papers (2024-10-31T12:33:26Z)
Improving Neuron-level Interpretability with White-box Language Models [11.898535906016907]
We introduce a white-box transformer-like architecture named Coding RAte TransformEr (CRATE)<n>Our comprehensive experiments showcase significant improvements (up to 103% relative improvement) in neuron-level interpretability.<n>CRATE's increased interpretability comes from its enhanced ability to consistently and distinctively activate on relevant tokens.
arXiv Detail & Related papers (2024-10-21T19:12:33Z)
Analysis of Argument Structure Constructions in a Deep Recurrent Language Model [0.0]
We explore the representation and processing of Argument Structure Constructions (ASCs) in a recurrent neural language model. Our results show that sentence representations form distinct clusters corresponding to the four ASCs across all hidden layers. This indicates that even a relatively simple, brain-constrained recurrent neural network can effectively differentiate between various construction types.
arXiv Detail & Related papers (2024-08-06T09:27:41Z)
Nonlinear classification of neural manifolds with contextual information [6.292933471495322]
We introduce a theoretical framework that leverages latent directions in input space, which can be related to contextual information.<n>We derive an exact formula for the context-dependent manifold capacity that depends on manifold geometry and context correlations.<n>Our framework's increased expressivity captures representation reformatting in deep networks at early stages of the layer hierarchy, previously inaccessible to analysis.
arXiv Detail & Related papers (2024-05-10T23:37:31Z)
In-Context Language Learning: Architectures and Algorithms [73.93205821154605]
We study ICL through the lens of a new family of model problems we term in context language learning (ICLL) We evaluate a diverse set of neural sequence models on regular ICLL tasks.
arXiv Detail & Related papers (2024-01-23T18:59:21Z)
How Graph Neural Networks Learn: Lessons from Training Dynamics [80.41778059014393]
We study the training dynamics in function space of graph neural networks (GNNs) We find that the gradient descent optimization of GNNs implicitly leverages the graph structure to update the learned function. This finding offers new interpretable insights into when and why the learned GNN functions generalize.
arXiv Detail & Related papers (2023-10-08T10:19:56Z)
N2G: A Scalable Approach for Quantifying Interpretable Neuron Representations in Large Language Models [0.0]
N2G is a tool which takes a neuron and its dataset examples, and automatically distills the neuron's behaviour on those examples to an interpretable graph. We use truncation and saliency methods to only present the important tokens, and augment the dataset examples with more diverse samples to better capture the extent of neuron behaviour. These graphs can be visualised to aid manual interpretation by researchers, but can also output token activations on text to compare to the neuron's ground truth activations for automatic validation.
arXiv Detail & Related papers (2023-04-22T19:06:13Z)
The Causal Neural Connection: Expressiveness, Learnability, and Inference [125.57815987218756]
An object called structural causal model (SCM) represents a collection of mechanisms and sources of random variation of the system under investigation. In this paper, we show that the causal hierarchy theorem (Thm. 1, Bareinboim et al., 2020) still holds for neural models. We introduce a special type of SCM called a neural causal model (NCM), and formalize a new type of inductive bias to encode structural constraints necessary for performing causal inferences.
arXiv Detail & Related papers (2021-07-02T01:55:18Z)
The Neural Coding Framework for Learning Generative Models [91.0357317238509]
We propose a novel neural generative model inspired by the theory of predictive processing in the brain. In a similar way, artificial neurons in our generative model predict what neighboring neurons will do, and adjust their parameters based on how well the predictions matched reality.
arXiv Detail & Related papers (2020-12-07T01:20:38Z)
Analyzing Individual Neurons in Pre-trained Language Models [41.07850306314594]
We find small subsets of neurons to predict linguistic tasks, with lower level tasks localized in fewer neurons, compared to higher level task of predicting syntax. For example, we found neurons in XLNet to be more localized and disjoint when predicting properties compared to BERT and others, where they are more distributed and coupled.
arXiv Detail & Related papers (2020-10-06T13:17:38Z)
How Neural Networks Extrapolate: From Feedforward to Graph Neural Networks [80.55378250013496]
We study how neural networks trained by gradient descent extrapolate what they learn outside the support of the training distribution. Graph Neural Networks (GNNs) have shown some success in more complex tasks.
arXiv Detail & Related papers (2020-09-24T17:48:59Z)
Closed Loop Neural-Symbolic Learning via Integrating Neural Perception, Grammar Parsing, and Symbolic Reasoning [134.77207192945053]
Prior methods learn the neural-symbolic models using reinforcement learning approaches. We introduce the textbfgrammar model as a textitsymbolic prior to bridge neural perception and symbolic reasoning. We propose a novel textbfback-search algorithm which mimics the top-down human-like learning procedure to propagate the error.
arXiv Detail & Related papers (2020-06-11T17:42:49Z)

This list is automatically generated from the titles and abstracts of the papers in this site.