Redundancy and Concept Analysis for Code-trained Language Models
- URL: http://arxiv.org/abs/2305.00875v2
- Date: Fri, 16 Feb 2024 04:21:53 GMT
- Title: Redundancy and Concept Analysis for Code-trained Language Models
- Authors: Arushi Sharma, Zefu Hu, Christopher Quinn, Ali Jannesari
- Abstract summary: Code-trained language models have proven to be highly effective for various code intelligence tasks.
They can be challenging to train and deploy for many software engineering applications due to computational bottlenecks and memory constraints.
We perform the first neuron-level analysis for source code models to identify *important* neurons within latent representations.
- Score: 5.726842555987591
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Code-trained language models have proven to be highly effective for various
code intelligence tasks. However, they can be challenging to train and deploy
for many software engineering applications due to computational bottlenecks and
memory constraints. Implementing effective strategies to address these issues
requires a better understanding of these 'black box' models. In this paper, we
perform the first neuron-level analysis for source code models to identify
\textit{important} neurons within latent representations. We achieve this by
eliminating neurons that are highly similar or irrelevant to the given task.
This approach helps us understand which neurons and layers can be eliminated
(redundancy analysis) and where important code properties are located within
the network (concept analysis). Using redundancy analysis, we make observations
relevant to knowledge transfer and model optimization applications. We find
that over 95\% of the neurons are redundant with respect to our code
intelligence tasks and can be eliminated without significant loss in accuracy.
We also discover several subsets of neurons that can make predictions with
baseline accuracy. Through concept analysis, we explore the traceability and
distribution of human-recognizable concepts within latent code representations
which could be used to influence model predictions. We trace individual and
subsets of important neurons to specific code properties and identify 'number'
neurons, 'string' neurons, and higher-level 'text' neurons for token-level
tasks and higher-level concepts important for sentence-level downstream tasks.
This also helps us understand how decomposable and transferable task-related
features are and can help devise better techniques for transfer learning, model
compression, and the decomposition of deep neural networks into modules.
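The redundancy analysis described above hinges on finding neurons whose activations are highly similar to one another, so that all but one representative per group can be removed. A minimal sketch of this idea, using synthetic activations in place of a real code model's hidden states (the data, threshold, and greedy selection order here are illustrative assumptions, not the paper's exact procedure):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical activations: rows = tokens, columns = neurons.
# Half the neurons are near-duplicates of the other half plus small noise,
# mimicking the kind of redundancy the paper reports.
base = rng.standard_normal((500, 32))
acts = np.hstack([base, base + 0.01 * rng.standard_normal((500, 32))])

def prune_redundant(acts, threshold=0.99):
    """Greedily keep one neuron from each group of highly correlated neurons.

    A neuron is kept only if its absolute Pearson correlation with every
    already-kept neuron is below the threshold.
    """
    corr = np.abs(np.corrcoef(acts.T))  # (n_neurons, n_neurons)
    keep = []
    for i in range(corr.shape[0]):
        if all(corr[i, j] < threshold for j in keep):
            keep.append(i)
    return keep

kept = prune_redundant(acts)
print(f"kept {len(kept)} of {acts.shape[1]} neurons")  # the 32 duplicates are pruned
```

In practice, accuracy on the downstream task would be re-measured after pruning to confirm, as the paper finds, that eliminating redundant neurons causes no significant loss.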
Related papers
- Simple and Effective Transfer Learning for Neuro-Symbolic Integration [50.592338727912946]
A potential solution to this issue is Neuro-Symbolic Integration (NeSy), where neural approaches are combined with symbolic reasoning.
Most of these methods exploit a neural network to map perceptions to symbols and a logical reasoner to predict the output of the downstream task.
They suffer from several issues, including slow convergence, learning difficulties with complex perception tasks, and convergence to local minima.
This paper proposes a simple yet effective method to ameliorate these problems.
arXiv Detail & Related papers (2024-02-21T15:51:01Z) - Hebbian Learning based Orthogonal Projection for Continual Learning of
Spiking Neural Networks [74.3099028063756]
We develop a new method with neuronal operations based on lateral connections and Hebbian learning.
We show that Hebbian and anti-Hebbian learning on recurrent lateral connections can effectively extract the principal subspace of neural activities.
Our method consistently enables spiking neural networks to learn continually with nearly zero forgetting.
arXiv Detail & Related papers (2024-02-19T09:29:37Z) - Investigating the Encoding of Words in BERT's Neurons using Feature
Textualization [11.943486282441143]
We propose a technique to produce representations of neurons in embedding word space.
We find that the produced representations can provide insights about the encoded knowledge in individual neurons.
arXiv Detail & Related papers (2023-11-14T15:21:49Z) - Identifying Interpretable Visual Features in Artificial and Biological
Neural Systems [3.604033202771937]
Single neurons in neural networks are often interpretable in that they represent individual, intuitively meaningful features.
Many neurons exhibit *mixed selectivity*, i.e., they represent multiple unrelated features.
We propose an automated method for quantifying visual interpretability and an approach for finding meaningful directions in network activation space.
arXiv Detail & Related papers (2023-10-17T17:41:28Z) - Automated Natural Language Explanation of Deep Visual Neurons with Large
Models [43.178568768100305]
This paper proposes a novel post-hoc framework for generating semantic explanations of neurons with large foundation models.
Our framework is designed to be compatible with various model architectures and datasets, enabling automated and scalable neuron interpretation.
arXiv Detail & Related papers (2023-10-16T17:04:51Z) - Implementing engrams from a machine learning perspective: matching for
prediction [0.0]
We propose how we might design a computer system to implement engrams using neural networks.
Building on autoencoders, we propose latent neural spaces as indexes for storing and retrieving information in a compressed format.
We consider how different states in latent neural spaces corresponding to different types of sensory input could be linked by synchronous activation.
arXiv Detail & Related papers (2023-03-01T10:05:40Z) - Constraints on the design of neuromorphic circuits set by the properties
of neural population codes [61.15277741147157]
In the brain, information is encoded, transmitted and used to inform behaviour.
Neuromorphic circuits need to encode information in a way compatible with that used by populations of neurons in the brain.
arXiv Detail & Related papers (2022-12-08T15:16:04Z) - Synergistic information supports modality integration and flexible
learning in neural networks solving multiple tasks [107.8565143456161]
We investigate the information processing strategies adopted by simple artificial neural networks performing a variety of cognitive tasks.
Results show that synergy increases as neural networks learn multiple diverse tasks.
Randomly turning off neurons during training through dropout increases network redundancy, corresponding to an increase in robustness.
arXiv Detail & Related papers (2022-10-06T15:36:27Z) - POPPINS : A Population-Based Digital Spiking Neuromorphic Processor with
Integer Quadratic Integrate-and-Fire Neurons [50.591267188664666]
We propose a population-based digital spiking neuromorphic processor in 180nm process technology with two hierarchical populations.
The proposed approach enables the development of biomimetic neuromorphic systems and various low-power, low-latency inference processing applications.
arXiv Detail & Related papers (2022-01-19T09:26:34Z) - Dynamic Neural Diversification: Path to Computationally Sustainable
Neural Networks [68.8204255655161]
Small neural networks with a constrained number of trainable parameters can be suitable resource-efficient candidates for many simple tasks.
We explore the diversity of the neurons within the hidden layer during the learning process.
We analyze how the diversity of the neurons affects predictions of the model.
arXiv Detail & Related papers (2021-09-20T15:12:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.