Brain-Inspired Exploration of Functional Networks and Key Neurons in Large Language Models
- URL: http://arxiv.org/abs/2502.20408v1
- Date: Thu, 13 Feb 2025 04:42:39 GMT
- Title: Brain-Inspired Exploration of Functional Networks and Key Neurons in Large Language Models
- Authors: Yiheng Liu, Xiaohui Gao, Haiyang Sun, Bao Ge, Tianming Liu, Junwei Han, Xintao Hu
- Abstract summary: We use methods similar to those in the field of functional neuroimaging analysis to locate and identify functional networks in large language models (LLMs). Experimental results show that, similar to the human brain, LLMs contain functional networks that frequently recur during operation. Masking key functional networks significantly impairs the model's performance, while retaining just a subset is adequate to maintain effective operation.
- Score: 53.91412558475662
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In recent years, the rapid advancement of large language models (LLMs) in natural language processing has sparked significant interest among researchers in understanding their mechanisms and functional characteristics. Although existing studies have attempted to explain LLM functionalities by identifying and interpreting specific neurons, these efforts mostly focus on individual neuron contributions, neglecting the fact that human brain functions are realized through intricate interaction networks. Inspired by cognitive neuroscience research on functional brain networks (FBNs), this study introduces a novel approach to investigate whether similar functional networks exist within LLMs. We use methods similar to those in the field of functional neuroimaging analysis to locate and identify functional networks in LLMs. Experimental results show that, similar to the human brain, LLMs contain functional networks that frequently recur during operation. Further analysis shows that these functional networks are crucial for LLM performance. Masking key functional networks significantly impairs the model's performance, while retaining just a subset of these networks is adequate to maintain effective operation. This research provides novel insights into the interpretation of LLMs and the lightweighting of LLMs for certain downstream tasks. Code is available at https://github.com/WhatAboutMyStar/LLM_ACTIVATION.
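The abstract does not spell out the analysis pipeline, but functional neuroimaging studies of the kind it references typically recover networks with independent component analysis (ICA) over activity recorded as a timepoints-by-units matrix. The sketch below shows that style of analysis on synthetic data: decompose recorded activations into components, threshold each component's spatial map to get its key neurons, and build a mask that ablates one network. All shapes, thresholds, and the use of FastICA are illustrative assumptions, not the paper's confirmed method.

```python
# A minimal sketch, assuming LLM activations were recorded as a
# (timepoints x neurons) matrix, analogous to fMRI analysis.
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
acts = rng.standard_normal((500, 2048))   # placeholder for recorded activations

ica = FastICA(n_components=16, random_state=0, max_iter=1000)
sources = ica.fit_transform(acts)         # (500, 16) temporal sources
networks = ica.mixing_.T                  # (16, 2048) spatial maps over neurons

# Threshold each spatial map to pick out the "key neurons" of each network.
key = {k: np.where(np.abs(networks[k]) > 2 * networks[k].std())[0]
       for k in range(16)}

# Masking a network amounts to zeroing its key neurons during a forward pass,
# e.g. multiplying the layer's activations by this mask inside a hook.
mask = np.ones(2048)
mask[key[0]] = 0.0
```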
Related papers
- Identifying Good and Bad Neurons for Task-Level Controllable LLMs [43.20582224913806]
Large Language Models have demonstrated remarkable capabilities on multiple-choice question answering benchmarks. The complex mechanisms underlying their large-scale neurons remain opaque, posing significant challenges for understanding and steering LLMs. We propose NeuronLLM, a novel task-level LLM understanding framework that adopts the biological principle of functional antagonism for LLM neuron identification.
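The abstract leaves "functional antagonism" undefined; one crude proxy for separating good from bad neurons is to contrast mean activations between prompts the model handles well and ones it does not, as in this hypothetical sketch (all data and sizes are made up):

```python
import numpy as np

rng = np.random.default_rng(0)
acts_good = rng.standard_normal((200, 4096))  # activations on correct answers
acts_bad = rng.standard_normal((200, 4096))   # activations on incorrect answers

contrast = acts_good.mean(0) - acts_bad.mean(0)
good_neurons = np.argsort(contrast)[-50:]     # most associated with success
bad_neurons = np.argsort(contrast)[:50]       # most associated with failure
```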
arXiv Detail & Related papers (2026-01-08T03:24:18Z) - Cognitive Mirrors: Exploring the Diverse Functional Roles of Attention Heads in LLM Reasoning [54.12174882424842]
Large language models (LLMs) have achieved state-of-the-art performance in a variety of tasks, but remain largely opaque in terms of their internal mechanisms. We propose a novel interpretability framework to systematically analyze the roles and behaviors of attention heads.
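The framework itself is not described here, but a standard way to probe an attention head's role is ablation: zero that head's slice of the attention output and measure what changes downstream. A minimal sketch, with shapes chosen arbitrarily:

```python
import torch

def ablate_head(attn_out: torch.Tensor, head: int, n_heads: int) -> torch.Tensor:
    """attn_out: (batch, seq, d_model) laid out as concatenated head outputs."""
    b, s, d = attn_out.shape
    out = attn_out.view(b, s, n_heads, d // n_heads).clone()
    out[:, :, head] = 0.0                      # silence the chosen head
    return out.view(b, s, d)

x = torch.randn(2, 10, 512)
y = ablate_head(x, head=3, n_heads=8)
print((x - y).abs().sum())                     # only head 3's slice changed
```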
arXiv Detail & Related papers (2025-12-03T10:24:34Z) - Pruning Large Language Models by Identifying and Preserving Functional Networks [41.601762545495255]
Structured pruning is a technique for compressing large language models (LLMs) to reduce GPU memory consumption and accelerate inference. Most structured pruning methods overlook the interaction and collaboration among artificial neurons that are crucial for the functionalities of LLMs. Inspired by the inherent similarities between artificial neural networks and functional neural networks in the human brain, we propose to prune LLMs by identifying and preserving functional networks.
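Assuming functional networks have already been identified as sets of neuron indices (as in the parent paper above), preserving them during structured pruning could look like the following sketch, where the FFN weights and the networks themselves are placeholders:

```python
import torch

d_ff = 8192
# Hypothetical functional networks, given as hidden-channel index sets.
networks = [torch.arange(0, 512), torch.arange(1024, 1600)]
keep = torch.unique(torch.cat(networks))

W_up = torch.randn(d_ff, 1024)     # stand-in for an FFN up-projection
W_down = torch.randn(1024, d_ff)   # stand-in for the matching down-projection

W_up_pruned = W_up[keep]           # keep only rows for preserved channels
W_down_pruned = W_down[:, keep]    # and the matching columns
print(W_up_pruned.shape, W_down_pruned.shape)
```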
arXiv Detail & Related papers (2025-08-07T10:27:01Z) - Probing Neural Topology of Large Language Models [12.298921317333452]
We introduce graph probing, a method for uncovering the functional connectivity of large language models. By probing models across diverse LLM families and scales, we discover a universal predictability of next-token prediction performance. Strikingly, probing on topology outperforms probing on activation by up to 130.4%.
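One plausible reading of graph probing, sketched below on synthetic data: build a functional connectivity graph from pairwise activation correlations, summarize its topology, and fit a linear probe from those features to per-prompt performance. The feature choices and correlation threshold here are assumptions.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
acts = rng.standard_normal((50, 300, 64))   # (prompts, timesteps, units)

feats = []
for a in acts:
    c = np.corrcoef(a.T)                    # (units, units) connectivity
    adj = (np.abs(c) > 0.3).astype(float)   # threshold into a graph
    deg = adj.sum(1)
    feats.append([deg.mean(), deg.std(), adj.mean()])  # crude topology summary

y = rng.standard_normal(50)                 # stand-in for next-token loss
probe = Ridge().fit(np.array(feats), y)
```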
arXiv Detail & Related papers (2025-06-01T14:57:03Z) - BrainMAP: Learning Multiple Activation Pathways in Brain Networks [77.15180533984947]
We introduce BrainMAP, a novel framework for learning Multiple Activation Pathways in brain networks. Our framework enables explanatory analyses of the crucial brain regions involved in tasks.
arXiv Detail & Related papers (2024-12-23T09:13:35Z) - Brain-like Functional Organization within Large Language Models [58.93629121400745]
The human brain has long inspired the pursuit of artificial intelligence (AI).
Recent neuroimaging studies provide compelling evidence of alignment between the computational representation of artificial neural networks (ANNs) and the neural responses of the human brain to stimuli.
In this study, we bridge this gap by directly coupling sub-groups of artificial neurons with functional brain networks (FBNs).
This framework links the AN sub-groups to FBNs, enabling the delineation of brain-like functional organization within large language models (LLMs).
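The coupling step could, in spirit, be as simple as correlating each artificial neuron's activation time course with each FBN's temporal signal and assigning neurons to their best-matching network; the sketch below does exactly that on synthetic stand-ins and is not the paper's actual procedure.

```python
import numpy as np

rng = np.random.default_rng(0)
neuron_ts = rng.standard_normal((200, 1024))  # (timepoints, artificial neurons)
fbn_ts = rng.standard_normal((200, 7))        # (timepoints, brain networks)

# Pearson correlation between every neuron and every FBN time course.
nz = (neuron_ts - neuron_ts.mean(0)) / neuron_ts.std(0)
fz = (fbn_ts - fbn_ts.mean(0)) / fbn_ts.std(0)
corr = nz.T @ fz / len(nz)                    # (1024, 7)

assignment = corr.argmax(1)                   # best-matching FBN per neuron
```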
arXiv Detail & Related papers (2024-10-25T13:15:17Z) - Cognitive LLMs: Towards Integrating Cognitive Architectures and Large Language Models for Manufacturing Decision-making [51.737762570776006]
LLM-ACTR is a novel neuro-symbolic architecture that provides human-aligned and versatile decision-making.
Our framework extracts and embeds knowledge of ACT-R's internal decision-making process as latent neural representations.
Our experiments on novel Design for Manufacturing tasks show both improved task performance as well as improved grounded decision-making capability.
arXiv Detail & Related papers (2024-08-17T11:49:53Z) - Towards Understanding Multi-Task Learning (Generalization) of LLMs via Detecting and Exploring Task-Specific Neurons [45.04661608619081]
We detect task-sensitive neurons in large language models (LLMs) via gradient attribution on task-specific data.
We find that the overlap of task-specific neurons is strongly associated with generalization and specialization across tasks.
We propose a neuron-level continuous fine-tuning method that only fine-tunes the current task-specific neurons during continuous learning.
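A common form of gradient attribution scores each neuron by |activation x gradient| on task data; the top-scoring neurons can then be the only ones updated during continual fine-tuning. A hedged sketch, where the loss, sizes, and cutoff are placeholders:

```python
import torch

act = torch.randn(32, 4096, requires_grad=True)  # recorded layer activations
loss = act.square().mean()                        # stand-in for a task loss
loss.backward()

scores = (act * act.grad).abs().mean(0)           # per-neuron attribution
task_neurons = scores.topk(100).indices

# During continual fine-tuning, gradients outside these neurons could be
# zeroed, e.g. weight.grad *= grad_mask on the rows tied to them.
grad_mask = torch.zeros(4096)
grad_mask[task_neurons] = 1.0
```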
arXiv Detail & Related papers (2024-07-09T01:27:35Z) - An Investigation of Neuron Activation as a Unified Lens to Explain Chain-of-Thought Eliciting Arithmetic Reasoning of LLMs [8.861378619584093]
Large language models (LLMs) have shown strong arithmetic reasoning capabilities when prompted with Chain-of-Thought prompts.
We investigate "neuron activation" as a lens to provide a unified explanation for the observations made by prior work.
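A simple version of that lens, sketched on synthetic data: record which neurons fire under chain-of-thought versus direct prompting of the same questions, and rank neurons by the difference in firing rates. The activation matrices and cutoff are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
acts_cot = rng.standard_normal((100, 4096))    # activations with CoT prompts
acts_plain = rng.standard_normal((100, 4096))  # activations with direct prompts

freq_cot = (acts_cot > 0).mean(0)              # firing rate per neuron
freq_plain = (acts_plain > 0).mean(0)
cot_sensitive = np.argsort(freq_cot - freq_plain)[-50:]
```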
arXiv Detail & Related papers (2024-06-18T05:49:24Z) - Sharing Matters: Analysing Neurons Across Languages and Tasks in LLMs [70.3132264719438]
We aim to fill the research gap by examining how neuron activation is shared across tasks and languages.
We classify neurons into four distinct categories based on their responses to a specific input across different languages.
Our analysis reveals the following insights: (i) the patterns of neuron sharing are significantly affected by the characteristics of tasks and examples; (ii) neuron sharing does not fully correspond with language similarity; (iii) shared neurons play a vital role in generating responses, especially those shared across all languages.
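The four categories are not named in this summary; one natural operationalization, shown below on synthetic activations, labels each neuron by how many of the input's language versions activate it. The category names are a paraphrase, not the paper's terminology.

```python
import numpy as np

rng = np.random.default_rng(0)
acts = rng.standard_normal((5, 4096))   # (languages, neurons) for one input
active = acts > 0                        # simple activation criterion

n_langs = active.sum(0)
category = np.select(
    [n_langs == 5, n_langs >= 2, n_langs == 1],
    ["all-shared", "partial-shared", "language-specific"],
    default="inactive",
)
```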
arXiv Detail & Related papers (2024-06-13T16:04:11Z) - Contextual Feature Extraction Hierarchies Converge in Large Language Models and the Brain [12.92793034617015]
We show that as large language models (LLMs) achieve higher performance on benchmark tasks, they become more brain-like.
We also show the importance of contextual information in improving model performance and brain similarity.
arXiv Detail & Related papers (2024-01-31T08:48:35Z) - Functional2Structural: Cross-Modality Brain Networks Representation Learning [55.24969686433101]
Graph mining on brain networks may facilitate the discovery of novel biomarkers for clinical phenotypes and neurodegenerative diseases.
We propose a novel graph learning framework, known as Deep Signed Brain Networks (DSBN), with a signed graph encoder.
We validate our framework on clinical phenotype and neurodegenerative disease prediction tasks using two independent, publicly available datasets.
arXiv Detail & Related papers (2022-05-06T03:45:36Z) - Estimating Reproducible Functional Networks Associated with Task Dynamics using Unsupervised LSTMs [4.697267141773321]
We propose a method for estimating more reproducible functional networks associated with task activity by using recurrent neural networks with long short-term memory (LSTM).
The LSTM model is trained in an unsupervised manner to generate the functional magnetic resonance imaging (fMRI) time-series data in regions of interest.
We demonstrate that the functional networks learned by the LSTM model are more strongly associated with the task activity and dynamics compared to other approaches.
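A minimal sketch of the unsupervised setup described: an LSTM trained to predict the next timepoint of ROI time series, whose learned dynamics can then be related to task structure. The data, sizes, and next-step objective are assumptions standing in for the paper's exact training scheme.

```python
import torch
import torch.nn as nn

T, n_roi = 200, 90
ts = torch.randn(1, T, n_roi)                # (batch, time, ROIs) fMRI series

lstm = nn.LSTM(n_roi, 64, batch_first=True)
head = nn.Linear(64, n_roi)
opt = torch.optim.Adam(list(lstm.parameters()) + list(head.parameters()), lr=1e-3)

for _ in range(100):
    h, _ = lstm(ts[:, :-1])                  # encode history up to t
    loss = nn.functional.mse_loss(head(h), ts[:, 1:])  # predict timepoint t+1
    opt.zero_grad(); loss.backward(); opt.step()
```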
arXiv Detail & Related papers (2021-05-06T17:53:22Z)