Discovering Salient Neurons in Deep NLP Models
- URL: http://arxiv.org/abs/2206.13288v2
- Date: Sun, 14 Jan 2024 13:25:01 GMT
- Title: Discovering Salient Neurons in Deep NLP Models
- Authors: Nadir Durrani and Fahim Dalvi and Hassan Sajjad
- Abstract summary: We present a technique called Linguistic Correlation Analysis to extract salient neurons from the model.
Our data-driven, quantitative analysis illuminates interesting findings.
Our code is publicly available as part of the NeuroX toolkit.
- Score: 31.18937787704794
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: While a lot of work has been done in understanding representations learned
within deep NLP models and what knowledge they capture, little attention has
been paid to individual neurons. We present a technique called Linguistic
Correlation Analysis to extract salient neurons in the model with respect to any
extrinsic property, with the goal of understanding how such knowledge is
preserved within neurons. We carry out a fine-grained analysis to answer the
following questions: (i) can we identify subsets of neurons in the network that
capture specific linguistic properties? (ii) how localized or distributed are
neurons across the network? (iii) how redundantly is the information preserved?
(iv) how does fine-tuning pre-trained models towards downstream NLP tasks impact
the learned linguistic knowledge? and (v) how do architectures vary in learning
different linguistic properties? Our data-driven, quantitative analysis
illuminates interesting findings: (i) we found small subsets of neurons that can
predict different linguistic tasks; (ii) neurons capturing basic lexical
information (such as suffixation) are localized in the lowermost layers;
(iii) neurons learning complex concepts (such as syntactic role) reside
predominantly in middle and higher layers; (iv) salient linguistic neurons are
relocated from higher to lower layers during transfer learning, as the network
preserves the higher layers for task-specific information; (v) we found
interesting differences across pre-trained models with respect to how linguistic
information is preserved within them; and (vi) we found that concepts exhibit
similar neuron distributions across different languages in multilingual
transformer models. Our code is publicly available as part of the NeuroX toolkit.
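The Linguistic Correlation Analysis described above essentially trains a regularized linear probe on neuron activations and ranks neurons by the magnitude of their learned weights. A minimal sketch of that idea is shown below, using scikit-learn on synthetic activations; the data shapes, hyperparameters, and the max-absolute-weight ranking are illustrative assumptions and do not reproduce the NeuroX implementation.

```python
# Sketch of weight-based neuron ranking with an elastic-net probe
# (illustrative only; not the NeuroX toolkit's own implementation).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical data: one activation vector per token (all neurons of the
# network concatenated) and a linguistic label per token (e.g. a POS tag id).
num_tokens, num_neurons, num_classes = 1000, 256, 5
activations = rng.normal(size=(num_tokens, num_neurons))
labels = rng.integers(0, num_classes, size=num_tokens)

# Elastic-net regularization encourages the probe to rely on few neurons,
# which is what makes a weight-based salience ranking meaningful.
probe = LogisticRegression(
    penalty="elasticnet", solver="saga", l1_ratio=0.5, C=0.1, max_iter=1000
)
probe.fit(activations, labels)

# Salience of a neuron = its largest absolute weight across classes;
# the top-ranked neurons are the candidate "salient" neurons.
salience = np.abs(probe.coef_).max(axis=0)
top_neurons = np.argsort(salience)[::-1][:20]
print("Top salient neurons:", top_neurons)
```

In practice the synthetic arrays would be replaced with activations extracted from a pre-trained model, and the selected neurons would be validated by re-probing with only the top-ranked subset.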
Related papers
- Sharing Matters: Analysing Neurons Across Languages and Tasks in LLMs [70.3132264719438]
We aim to fill the research gap by examining how neuron activation is shared across tasks and languages.
We classify neurons into four distinct categories based on their responses to a specific input across different languages.
Our analysis reveals the following insights: (i) the patterns of neuron sharing are significantly affected by the characteristics of tasks and examples; (ii) neuron sharing does not fully correspond with language similarity; (iii) shared neurons play a vital role in generating responses, especially those shared across all languages.
arXiv Detail & Related papers (2024-06-13T16:04:11Z) - Identification of Knowledge Neurons in Protein Language Models [0.0]
We identify and characterize knowledge neurons, components that express understanding of key information.
We show that there is a high density of knowledge neurons in the key vector prediction networks of self-attention modules.
In the future, the types of knowledge captured by each neuron could be characterized.
arXiv Detail & Related papers (2023-12-17T17:23:43Z) - Investigating the Encoding of Words in BERT's Neurons using Feature
Textualization [11.943486282441143]
We propose a technique to produce representations of neurons in embedding word space.
We find that the produced representations can provide insights about the encoded knowledge in individual neurons.
arXiv Detail & Related papers (2023-11-14T15:21:49Z) - Same Neurons, Different Languages: Probing Morphosyntax in Multilingual
Pre-trained Models [84.86942006830772]
We conjecture that multilingual pre-trained models can derive language-universal abstractions about grammar.
We conduct the first large-scale empirical study over 43 languages and 14 morphosyntactic categories with a state-of-the-art neuron-level probe.
arXiv Detail & Related papers (2022-05-04T12:22:31Z) - Natural Language Descriptions of Deep Visual Features [50.270035018478666]
We introduce a procedure that automatically labels neurons with open-ended, compositional, natural language descriptions.
We use MILAN for analysis, characterizing the distribution and importance of neurons selective for attribute, category, and relational information in vision models.
We also use MILAN for auditing, surfacing neurons sensitive to protected categories like race and gender in models trained on datasets intended to obscure these features.
arXiv Detail & Related papers (2022-01-26T18:48:02Z) - What do End-to-End Speech Models Learn about Speaker, Language and
Channel Information? A Layer-wise and Neuron-level Analysis [16.850888973106706]
We conduct a post-hoc functional interpretability analysis of pretrained speech models using the probing framework.
We analyze utterance-level representations of speech models trained for various tasks such as speaker recognition and dialect identification.
Our results reveal several novel findings, including: (i) channel and gender information are distributed across the network, (ii) the information is redundantly available in neurons with respect to a task, and (iii) complex properties such as dialectal information are encoded only in the task-oriented pretrained network.
arXiv Detail & Related papers (2021-07-01T13:32:55Z) - How transfer learning impacts linguistic knowledge in deep NLP models? [22.035813865470956]
Deep NLP models learn a non-trivial amount of linguistic knowledge, captured at different layers of the model.
We investigate how fine-tuning towards downstream NLP tasks impacts the learned linguistic knowledge.
arXiv Detail & Related papers (2021-05-31T17:43:57Z) - Analyzing Individual Neurons in Pre-trained Language Models [41.07850306314594]
We find small subsets of neurons that predict linguistic tasks, with lower-level tasks localized in fewer neurons compared to the higher-level task of predicting syntax.
For example, we found neurons in XLNet to be more localized and disjoint when predicting properties compared to BERT and others, where they are more distributed and coupled.
arXiv Detail & Related papers (2020-10-06T13:17:38Z) - Compositional Explanations of Neurons [52.71742655312625]
We describe a procedure for explaining neurons in deep representations by identifying compositional logical concepts.
We use this procedure to answer several questions on interpretability in models for vision and natural language processing.
arXiv Detail & Related papers (2020-06-24T20:37:05Z) - Non-linear Neurons with Human-like Apical Dendrite Activations [81.18416067005538]
We show that a standard neuron followed by our novel apical dendrite activation (ADA) can learn the XOR logical function with 100% accuracy.
We conduct experiments on six benchmark data sets from computer vision, signal processing and natural language processing.
arXiv Detail & Related papers (2020-02-02T21:09:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.