Related papers: Analyzing Key Neurons in Large Language Models

Analyzing Key Neurons in Large Language Models

URL: http://arxiv.org/abs/2406.10868v1
Date: Sun, 16 Jun 2024 09:36:32 GMT
Title: Analyzing Key Neurons in Large Language Models
Authors: Lihu Chen, Adam Dejl, Francesca Toni,
Abstract summary: Large Language Models (LLMs) possess vast amounts of knowledge within their parameters, prompting research into methods for locating and editing this knowledge. We introduce Neuron-Inverse Cluster Attribution (NA-ICA), a novel architecture-agnostic framework capable of identifying key neurons in LLMs.
Score: 14.69046890281591
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large Language Models (LLMs) possess vast amounts of knowledge within their parameters, prompting research into methods for locating and editing this knowledge. Previous investigations have primarily focused on fill-in-the-blank tasks and locating entity-related usually single-token facts) information in relatively small-scale language models. However, several key questions remain unanswered: (1) How can we effectively locate query-relevant neurons in contemporary autoregressive LLMs, such as LLaMA and Mistral? (2) How can we address the challenge of long-form text generation? (3) Are there localized knowledge regions in LLMs? In this study, we introduce Neuron Attribution-Inverse Cluster Attribution (NA-ICA), a novel architecture-agnostic framework capable of identifying key neurons in LLMs. NA-ICA allows for the examination of long-form answers beyond single tokens by employing the proxy task of multi-choice question answering. To evaluate the effectiveness of our detected key neurons, we construct two multi-choice QA datasets spanning diverse domains and languages. Empirical evaluations demonstrate that NA-ICA outperforms baseline methods significantly. Moreover, analysis of neuron distributions reveals the presence of visible localized regions, particularly within different domains. Finally, we demonstrate the potential applications of our detected key neurons in knowledge editing and neuron-based prediction.

Related papers

How does Alignment Enhance LLMs' Multilingual Capabilities? A Language Neurons Perspective [64.79894853375478]
We propose a new finer-grained neuron identification algorithm, which detects language neurons(including language-specific neurons and language-related neurons) and language-agnostic neurons.<n>Based on the distributional characteristics of different types of neurons, we divide the LLMs' internal process for multilingual inference into four parts.<n>We systematically analyze the models before and after alignment with a focus on different types of neurons.
arXiv Detail & Related papers (2025-05-27T17:59:52Z)
Reasoning Capabilities and Invariability of Large Language Models [49.23570751696334]
We aim to provide a comprehensive analysis of Large Language Models' reasoning competence.<n>We introduce a new benchmark dataset with a series of simple reasoning questions demanding shallow logical reasoning.<n>An empirical analysis involving zero-shot and few-shot prompting across 24 LLMs of different sizes reveals that, while LLMs with over 70 billion parameters perform better in the zero-shot setting, there is still a large room for improvement.
arXiv Detail & Related papers (2025-05-01T18:12:30Z)
MicroVQA: A Multimodal Reasoning Benchmark for Microscopy-Based Scientific Research [57.61445960384384]
MicroVQA consists of 1,042 multiple-choice questions (MCQs) curated by biology experts across diverse microscopy modalities. Benchmarking on state-of-the-art MLLMs reveal a peak performance of 53%. Expert analysis of chain-of-thought responses shows perception errors are the most frequent, followed by knowledge errors and then overgeneralization errors.
arXiv Detail & Related papers (2025-03-17T17:33:10Z)
The Knowledge Microscope: Features as Better Analytical Lenses than Neurons [15.883209651151155]
Previous studies primarily utilize neurons as units of analysis for understanding the mechanisms of factual knowledge in Language Models (LMs) In this paper, we first conduct preliminary experiments to validate that Sparse Autoencoders (SAE) can effectively decompose neurons into features, which serve as alternative analytical units.
arXiv Detail & Related papers (2025-02-18T03:09:55Z)
One Mind, Many Tongues: A Deep Dive into Language-Agnostic Knowledge Neurons in Large Language Models [19.58983929459173]
Large language models (LLMs) have learned vast amounts of factual knowledge through self-supervised pre-training on large-scale corpora. LLMs have also demonstrated excellent multilingual capabilities, which can express the learned knowledge in multiple languages.
arXiv Detail & Related papers (2024-11-26T13:03:49Z)
Multilingual Knowledge Editing with Language-Agnostic Factual Neurons [98.73585104789217]
We investigate how large language models (LLMs) represent multilingual factual knowledge. We find that the same factual knowledge in different languages generally activates a shared set of neurons, which we call language-agnostic factual neurons. Inspired by this finding, we propose a new MKE method by locating and modifying Language-Agnostic Factual Neurons (LAFN) to simultaneously edit multilingual knowledge.
arXiv Detail & Related papers (2024-06-24T08:06:56Z)
Sharing Matters: Analysing Neurons Across Languages and Tasks in LLMs [70.3132264719438]
We aim to fill the research gap by examining how neuron activation is shared across tasks and languages. We classify neurons into four distinct categories based on their responses to a specific input across different languages. Our analysis reveals the following insights: (i) the patterns of neuron sharing are significantly affected by the characteristics of tasks and examples; (ii) neuron sharing does not fully correspond with language similarity; (iii) shared neurons play a vital role in generating responses, especially those shared across all languages.
arXiv Detail & Related papers (2024-06-13T16:04:11Z)
M-QALM: A Benchmark to Assess Clinical Reading Comprehension and Knowledge Recall in Large Language Models via Question Answering [14.198330378235632]
We use Multiple Choice and Abstractive Question Answering to conduct a large-scale empirical study on 22 datasets in three generalist and three specialist biomedical sub-domains. Our multifaceted analysis of the performance of 15 LLMs uncovers success factors such as instruction tuning that lead to improved recall and comprehension. We show that while recently proposed domain-adapted models may lack adequate knowledge, directly fine-tuning on our collected medical knowledge datasets shows encouraging results. We complement the quantitative results with a skill-oriented manual error analysis, which reveals a significant gap between the models' capabilities to simply recall necessary knowledge and to integrate it with the presented
arXiv Detail & Related papers (2024-06-06T02:43:21Z)
Crafting Interpretable Embeddings by Asking LLMs Questions [89.49960984640363]
Large language models (LLMs) have rapidly improved text embeddings for a growing array of natural-language processing tasks. We introduce question-answering embeddings (QA-Emb), embeddings where each feature represents an answer to a yes/no question asked to an LLM. We use QA-Emb to flexibly generate interpretable models for predicting fMRI voxel responses to language stimuli.
arXiv Detail & Related papers (2024-05-26T22:30:29Z)
Language-Specific Neurons: The Key to Multilingual Capabilities in Large Language Models [117.20416338476856]
Large language models (LLMs) demonstrate remarkable multilingual capabilities without being pre-trained on specially curated multilingual parallel corpora. We propose a novel detection method, language activation probability entropy (LAPE), to identify language-specific neurons within LLMs. Our findings indicate that LLMs' proficiency in processing a particular language is predominantly due to a small subset of neurons.
arXiv Detail & Related papers (2024-02-26T09:36:05Z)
Unveiling A Core Linguistic Region in Large Language Models [49.860260050718516]
This paper conducts an analogical research using brain localization as a prototype. We have discovered a core region in large language models that corresponds to linguistic competence. We observe that an improvement in linguistic competence does not necessarily accompany an elevation in the model's knowledge level.
arXiv Detail & Related papers (2023-10-23T13:31:32Z)
Discovering Salient Neurons in Deep NLP Models [31.18937787704794]
We present a technique called as Linguistic Correlation Analysis to extract salient neurons in the model. Our data-driven, quantitative analysis illuminates interesting findings. Our code is publicly available as part of the NeuroX toolkit.
arXiv Detail & Related papers (2022-06-27T13:31:49Z)
Scalable Query Answering under Uncertainty to Neuroscientific Ontological Knowledge: The NeuroLang Approach [2.216657815393579]
Researchers in neuroscience have a growing number of datasets available to study the brain. There is currently no unifying framework for accessing such collections of rich heterogeneous data under uncertainty. We present NeuroLang, an ontology language with existential rules, probabilistic uncertainty, and built-in mechanisms to guarantee tractable query answering over very large datasets.
arXiv Detail & Related papers (2022-02-23T07:34:03Z)
Modeling Shared Responses in Neuroimaging Studies through MultiView ICA [94.31804763196116]
Group studies involving large cohorts of subjects are important to draw general conclusions about brain functional organization. We propose a novel MultiView Independent Component Analysis model for group studies, where data from each subject are modeled as a linear combination of shared independent sources plus noise. We demonstrate the usefulness of our approach first on fMRI data, where our model demonstrates improved sensitivity in identifying common sources among subjects.
arXiv Detail & Related papers (2020-06-11T17:29:53Z)

This list is automatically generated from the titles and abstracts of the papers in this site.