Brain-Inspired Exploration of Functional Networks and Key Neurons in Large Language Models
- URL: http://arxiv.org/abs/2502.20408v1
- Date: Thu, 13 Feb 2025 04:42:39 GMT
- Title: Brain-Inspired Exploration of Functional Networks and Key Neurons in Large Language Models
- Authors: Yiheng Liu, Xiaohui Gao, Haiyang Sun, Bao Ge, Tianming Liu, Junwei Han, Xintao Hu
- Abstract summary: We use methods similar to those in the field of functional neuroimaging analysis to locate and identify functional networks in large language models (LLMs). Experimental results show that, similar to the human brain, LLMs contain functional networks that frequently recur during operation. Masking key functional networks significantly impairs the model's performance, while retaining just a subset is adequate to maintain effective operation.
- Score: 53.91412558475662
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In recent years, the rapid advancement of large language models (LLMs) in natural language processing has sparked significant interest among researchers in understanding their mechanisms and functional characteristics. Although existing studies have attempted to explain LLM functionalities by identifying and interpreting specific neurons, these efforts mostly focus on individual neuron contributions, neglecting the fact that human brain functions are realized through intricate interaction networks. Inspired by cognitive neuroscience research on functional brain networks (FBNs), this study introduces a novel approach to investigate whether similar functional networks exist within LLMs. We use methods similar to those in the field of functional neuroimaging analysis to locate and identify functional networks in LLMs. Experimental results show that, similar to the human brain, LLMs contain functional networks that frequently recur during operation. Further analysis shows that these functional networks are crucial for LLM performance. Masking key functional networks significantly impairs the model's performance, while retaining just a subset of these networks is adequate to maintain effective operation. This research provides novel insights into the interpretation of LLMs and the lightweighting of LLMs for certain downstream tasks. Code is available at https://github.com/WhatAboutMyStar/LLM_ACTIVATION.
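The abstract does not spell out the analysis pipeline, but functional neuroimaging studies of the kind it references typically recover networks with independent component analysis (ICA) over activity recorded as a timepoints-by-units matrix. The sketch below shows that style of analysis on synthetic data: decompose recorded activations into components, threshold each component's spatial map to get its key neurons, and build a mask that ablates one network. All shapes, thresholds, and the use of FastICA are illustrative assumptions, not the paper's confirmed method.

```python
# A minimal sketch, assuming LLM activations were recorded as a
# (timepoints x neurons) matrix, analogous to fMRI analysis.
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
acts = rng.standard_normal((500, 2048))   # placeholder for recorded activations

ica = FastICA(n_components=16, random_state=0, max_iter=1000)
sources = ica.fit_transform(acts)         # (500, 16) temporal sources
networks = ica.mixing_.T                  # (16, 2048) spatial maps over neurons

# Threshold each spatial map to pick out the "key neurons" of each network.
key = {k: np.where(np.abs(networks[k]) > 2 * networks[k].std())[0]
       for k in range(16)}

# Masking a network amounts to zeroing its key neurons during a forward pass,
# e.g. multiplying the layer's activations by this mask inside a hook.
mask = np.ones(2048)
mask[key[0]] = 0.0
```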
Related papers
- Identifying Good and Bad Neurons for Task-Level Controllable LLMs [43.20582224913806]
Large Language Models have demonstrated remarkable capabilities on multiple-choice question answering benchmarks. The complex mechanisms underlying their large-scale neurons remain opaque, posing significant challenges for understanding and steering LLMs. We propose NeuronLLM, a novel task-level LLM understanding framework that adopts the biological principle of functional antagonism for LLM neuron identification.
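The abstract leaves "functional antagonism" undefined; one crude proxy for separating good from bad neurons is to contrast mean activations between prompts the model handles well and ones it does not, as in this hypothetical sketch (all data and sizes are made up):

```python
import numpy as np

rng = np.random.default_rng(0)
acts_good = rng.standard_normal((200, 4096))  # activations on correct answers
acts_bad = rng.standard_normal((200, 4096))   # activations on incorrect answers

contrast = acts_good.mean(0) - acts_bad.mean(0)
good_neurons = np.argsort(contrast)[-50:]     # most associated with success
bad_neurons = np.argsort(contrast)[:50]       # most associated with failure
```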
arXiv Detail & Related papers (2026-01-08T03:24:18Z) - Cognitive Mirrors: Exploring the Diverse Functional Roles of Attention Heads in LLM Reasoning [54.12174882424842]
Large language models (LLMs) have achieved state-of-the-art performance in a variety of tasks, but remain largely opaque in terms of their internal mechanisms. We propose a novel interpretability framework to systematically analyze the roles and behaviors of attention heads.
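The framework itself is not described here, but a standard way to probe an attention head's role is ablation: zero that head's slice of the attention output and measure what changes downstream. A minimal sketch, with shapes chosen arbitrarily:

```python
import torch

def ablate_head(attn_out: torch.Tensor, head: int, n_heads: int) -> torch.Tensor:
    """attn_out: (batch, seq, d_model) laid out as concatenated head outputs."""
    b, s, d = attn_out.shape
    out = attn_out.view(b, s, n_heads, d // n_heads).clone()
    out[:, :, head] = 0.0                      # silence the chosen head
    return out.view(b, s, d)

x = torch.randn(2, 10, 512)
y = ablate_head(x, head=3, n_heads=8)
print((x - y).abs().sum())                     # only head 3's slice changed
```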
arXiv Detail & Related papers (2025-12-03T10:24:34Z) - Pruning Large Language Models by Identifying and Preserving Functional Networks [41.601762545495255]
Structured pruning is a technique for compressing large language models (LLMs) to reduce GPU memory consumption and accelerate inference. Most structured pruning methods overlook the interaction and collaboration among artificial neurons that are crucial for the functionalities of LLMs. Inspired by the inherent similarities between artificial neural networks and functional neural networks in the human brain, we propose to prune LLMs by identifying and preserving functional networks.
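Assuming functional networks have already been identified as sets of neuron indices (as in the parent paper above), preserving them during structured pruning could look like the following sketch, where the FFN weights and the networks themselves are placeholders:

```python
import torch

d_ff = 8192
# Hypothetical functional networks, given as hidden-channel index sets.
networks = [torch.arange(0, 512), torch.arange(1024, 1600)]
keep = torch.unique(torch.cat(networks))

W_up = torch.randn(d_ff, 1024)     # stand-in for an FFN up-projection
W_down = torch.randn(1024, d_ff)   # stand-in for the matching down-projection

W_up_pruned = W_up[keep]           # keep only rows for preserved channels
W_down_pruned = W_down[:, keep]    # and the matching columns
print(W_up_pruned.shape, W_down_pruned.shape)
```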
arXiv Detail & Related papers (2025-08-07T10:27:01Z) - Probing Neural Topology of Large Language Models [12.298921317333452]
We introduce graph probing, a method for uncovering the functional connectivity of large language models. By probing models across diverse LLM families and scales, we discover a universal predictability of next-token prediction performance. Strikingly, probing on topology outperforms probing on activation by up to 130.4%.
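One plausible reading of graph probing, sketched below on synthetic data: build a functional connectivity graph from pairwise activation correlations, summarize its topology, and fit a linear probe from those features to per-prompt performance. The feature choices and correlation threshold here are assumptions.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
acts = rng.standard_normal((50, 300, 64))   # (prompts, timesteps, units)

feats = []
for a in acts:
    c = np.corrcoef(a.T)                    # (units, units) connectivity
    adj = (np.abs(c) > 0.3).astype(float)   # threshold into a graph
    deg = adj.sum(1)
    feats.append([deg.mean(), deg.std(), adj.mean()])  # crude topology summary

y = rng.standard_normal(50)                 # stand-in for next-token loss
probe = Ridge().fit(np.array(feats), y)
```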
arXiv Detail & Related papers (2025-06-01T14:57:03Z) - BrainMAP: Learning Multiple Activation Pathways in Brain Networks [77.15180533984947]
We introduce BrainMAP, a novel framework for learning Multiple Activation Pathways in brain networks. Our framework enables explanatory analyses of the crucial brain regions involved in tasks.
arXiv Detail & Related papers (2024-12-23T09:13:35Z) - Brain-like Functional Organization within Large Language Models [58.93629121400745]
The human brain has long inspired the pursuit of artificial intelligence (AI).
Recent neuroimaging studies provide compelling evidence of alignment between the computational representation of artificial neural networks (ANNs) and the neural responses of the human brain to stimuli.
In this study, we bridge this gap by directly coupling sub-groups of artificial neurons with functional brain networks (FBNs).
This framework links the AN sub-groups to FBNs, enabling the delineation of brain-like functional organization within large language models (LLMs).
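The coupling step could, in spirit, be as simple as correlating each artificial neuron's activation time course with each FBN's temporal signal and assigning neurons to their best-matching network; the sketch below does exactly that on synthetic stand-ins and is not the paper's actual procedure.

```python
import numpy as np

rng = np.random.default_rng(0)
neuron_ts = rng.standard_normal((200, 1024))  # (timepoints, artificial neurons)
fbn_ts = rng.standard_normal((200, 7))        # (timepoints, brain networks)

# Pearson correlation between every neuron and every FBN time course.
nz = (neuron_ts - neuron_ts.mean(0)) / neuron_ts.std(0)
fz = (fbn_ts - fbn_ts.mean(0)) / fbn_ts.std(0)
corr = nz.T @ fz / len(nz)                    # (1024, 7)

assignment = corr.argmax(1)                   # best-matching FBN per neuron
```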
arXiv Detail & Related papers (2024-10-25T13:15:17Z) - Cognitive LLMs: Towards Integrating Cognitive Architectures and Large Language Models for Manufacturing Decision-making [51.737762570776006]
LLM-ACTR is a novel neuro-symbolic architecture that provides human-aligned and versatile decision-making.
Our framework extracts and embeds knowledge of ACT-R's internal decision-making process as latent neural representations.
Our experiments on novel Design for Manufacturing tasks show both improved task performance as well as improved grounded decision-making capability.
arXiv Detail & Related papers (2024-08-17T11:49:53Z) - Towards Understanding Multi-Task Learning (Generalization) of LLMs via Detecting and Exploring Task-Specific Neurons [45.04661608619081]
We detect task-sensitive neurons in large language models (LLMs) via gradient attribution on task-specific data.
We find that the overlap of task-specific neurons is strongly associated with generalization and specialization across tasks.
We propose a neuron-level continuous fine-tuning method that only fine-tunes the current task-specific neurons during continuous learning.
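A common form of gradient attribution scores each neuron by |activation x gradient| on task data; the top-scoring neurons can then be the only ones updated during continual fine-tuning. A hedged sketch, where the loss, sizes, and cutoff are placeholders:

```python
import torch

act = torch.randn(32, 4096, requires_grad=True)  # recorded layer activations
loss = act.square().mean()                        # stand-in for a task loss
loss.backward()

scores = (act * act.grad).abs().mean(0)           # per-neuron attribution
task_neurons = scores.topk(100).indices

# During continual fine-tuning, gradients outside these neurons could be
# zeroed, e.g. weight.grad *= grad_mask on the rows tied to them.
grad_mask = torch.zeros(4096)
grad_mask[task_neurons] = 1.0
```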
arXiv Detail & Related papers (2024-07-09T01:27:35Z) - An Investigation of Neuron Activation as a Unified Lens to Explain Chain-of-Thought Eliciting Arithmetic Reasoning of LLMs [8.861378619584093]
Large language models (LLMs) have shown strong arithmetic reasoning capabilities when prompted with Chain-of-Thought prompts.
We investigate "neuron activation" as a lens to provide a unified explanation for the observations made by prior work.
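A simple version of that lens, sketched on synthetic data: record which neurons fire under chain-of-thought versus direct prompting of the same questions, and rank neurons by the difference in firing rates. The activation matrices and cutoff are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
acts_cot = rng.standard_normal((100, 4096))    # activations with CoT prompts
acts_plain = rng.standard_normal((100, 4096))  # activations with direct prompts

freq_cot = (acts_cot > 0).mean(0)              # firing rate per neuron
freq_plain = (acts_plain > 0).mean(0)
cot_sensitive = np.argsort(freq_cot - freq_plain)[-50:]
```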
arXiv Detail & Related papers (2024-06-18T05:49:24Z) - Sharing Matters: Analysing Neurons Across Languages and Tasks in LLMs [70.3132264719438]
We aim to fill the research gap by examining how neuron activation is shared across tasks and languages.
We classify neurons into four distinct categories based on their responses to a specific input across different languages.
Our analysis reveals the following insights: (i) the patterns of neuron sharing are significantly affected by the characteristics of tasks and examples; (ii) neuron sharing does not fully correspond with language similarity; (iii) shared neurons play a vital role in generating responses, especially those shared across all languages.
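The four categories are not named in this summary; one natural operationalization, shown below on synthetic activations, labels each neuron by how many of the input's language versions activate it. The category names are a paraphrase, not the paper's terminology.

```python
import numpy as np

rng = np.random.default_rng(0)
acts = rng.standard_normal((5, 4096))   # (languages, neurons) for one input
active = acts > 0                        # simple activation criterion

n_langs = active.sum(0)
category = np.select(
    [n_langs == 5, n_langs >= 2, n_langs == 1],
    ["all-shared", "partial-shared", "language-specific"],
    default="inactive",
)
```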
arXiv Detail & Related papers (2024-06-13T16:04:11Z) - Contextual Feature Extraction Hierarchies Converge in Large Language Models and the Brain [12.92793034617015]
We show that as large language models (LLMs) achieve higher performance on benchmark tasks, they become more brain-like.
We also show the importance of contextual information in improving model performance and brain similarity.
arXiv Detail & Related papers (2024-01-31T08:48:35Z) - Functional2Structural: Cross-Modality Brain Networks Representation Learning [55.24969686433101]
Graph mining on brain networks may facilitate the discovery of novel biomarkers for clinical phenotypes and neurodegenerative diseases.
We propose a novel graph learning framework, known as Deep Signed Brain Networks (DSBN), with a signed graph encoder.
We validate our framework on clinical phenotype and neurodegenerative disease prediction tasks using two independent, publicly available datasets.
arXiv Detail & Related papers (2022-05-06T03:45:36Z) - Estimating Reproducible Functional Networks Associated with Task Dynamics using Unsupervised LSTMs [4.697267141773321]
We propose a method for estimating more reproducible functional networks associated with task activity by using recurrent neural networks with long short-term memory (LSTM).
The LSTM model is trained in an unsupervised manner to generate the functional magnetic resonance imaging (fMRI) time-series data in regions of interest.
We demonstrate that the functional networks learned by the LSTM model are more strongly associated with the task activity and dynamics compared to other approaches.
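A minimal sketch of the unsupervised setup described: an LSTM trained to predict the next timepoint of ROI time series, whose learned dynamics can then be related to task structure. The data, sizes, and next-step objective are assumptions standing in for the paper's exact training scheme.

```python
import torch
import torch.nn as nn

T, n_roi = 200, 90
ts = torch.randn(1, T, n_roi)                # (batch, time, ROIs) fMRI series

lstm = nn.LSTM(n_roi, 64, batch_first=True)
head = nn.Linear(64, n_roi)
opt = torch.optim.Adam(list(lstm.parameters()) + list(head.parameters()), lr=1e-3)

for _ in range(100):
    h, _ = lstm(ts[:, :-1])                  # encode history up to t
    loss = nn.functional.mse_loss(head(h), ts[:, 1:])  # predict timepoint t+1
    opt.zero_grad(); loss.backward(); opt.step()
```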
arXiv Detail & Related papers (2021-05-06T17:53:22Z)