Mining Program Properties From Neural Networks Trained on Source Code Embeddings
- URL: http://arxiv.org/abs/2103.05442v1
- Date: Tue, 9 Mar 2021 14:25:16 GMT
- Title: Mining Program Properties From Neural Networks Trained on Source Code Embeddings
- Authors: Martina Saletta, Claudio Ferretti
- Abstract summary: We propose a novel approach for mining different program features by analysing the internal behaviour of a deep neural network trained on source code.
We train an autoencoder for each program embedding and then test the emerging ability of the internal neurons to autonomously build internal representations for different program features.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose a novel approach for mining different program
features by analysing the internal behaviour of a deep neural network trained
on source code. Using an unlabelled dataset of Java programs and three
different embedding strategies for the methods in the dataset, we train an
autoencoder for each program embedding and then test the emerging ability of
the internal neurons to autonomously build internal representations for
different program features. We define three binary classification labelling
policies inspired by real programming issues, so as to test the performance of
each neuron in classifying programs according to these rules, and we show that
some neurons can indeed detect different program properties. We also analyse
how the program representation chosen as input affects performance on these
tasks. Beyond these task-specific analyses, we are also interested
in finding the overall most informative neurons in the network regardless of a
given task. To this end, we propose and evaluate two methods for ranking
neurons independently of any property. Finally, we discuss how these ideas
could be applied in different settings to simplify programmers' work, for
instance by integrating them into environments such as software repositories or
code editors.
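As a rough illustration of the pipeline the abstract describes, the sketch below trains a small autoencoder on stand-ins for method embeddings, scores each bottleneck neuron as a one-feature classifier for a binary labelling policy, and ranks neurons without reference to any task. The network sizes, the use of ROC-AUC as the per-neuron score, and the variance-based ranking are illustrative assumptions, not the paper's actual choices.

```python
# Minimal sketch of the approach in the abstract, under assumed details:
# sizes, ROC-AUC scoring, and variance ranking are NOT the paper's own.
import numpy as np
import torch
import torch.nn as nn
from sklearn.metrics import roc_auc_score

EMB_DIM, HID_DIM = 128, 32  # assumed embedding and bottleneck sizes

class AutoEncoder(nn.Module):
    def __init__(self, emb_dim: int, hid_dim: int):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(emb_dim, hid_dim), nn.Tanh())
        self.decoder = nn.Linear(hid_dim, emb_dim)

    def forward(self, x):
        return self.decoder(self.encoder(x))

def train_autoencoder(embeddings: np.ndarray, epochs: int = 50) -> AutoEncoder:
    """Unsupervised reconstruction training on one program embedding."""
    x = torch.tensor(embeddings, dtype=torch.float32)
    model = AutoEncoder(EMB_DIM, HID_DIM)
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(x), x)
        loss.backward()
        opt.step()
    return model

def neuron_scores(model: AutoEncoder, embeddings: np.ndarray,
                  labels: np.ndarray) -> np.ndarray:
    """Score each hidden neuron as a one-feature classifier for a binary
    program property (e.g. one of the paper's labelling policies)."""
    with torch.no_grad():
        acts = model.encoder(torch.tensor(embeddings, dtype=torch.float32)).numpy()
    # Take the better of the neuron and its negation, so scores are >= 0.5.
    return np.array([max(roc_auc_score(labels, acts[:, j]),
                         1.0 - roc_auc_score(labels, acts[:, j]))
                     for j in range(acts.shape[1])])

def rank_neurons_task_free(model: AutoEncoder, embeddings: np.ndarray) -> np.ndarray:
    """Task-independent ranking stand-in: order neurons by activation
    variance. The paper proposes two such rankings; this heuristic is only
    a plausible placeholder, not the authors' method."""
    with torch.no_grad():
        acts = model.encoder(torch.tensor(embeddings, dtype=torch.float32)).numpy()
    return np.argsort(-acts.var(axis=0))

# Toy usage with random arrays standing in for embeddings and labels.
rng = np.random.default_rng(0)
emb = rng.normal(size=(500, EMB_DIM)).astype(np.float32)
lab = rng.integers(0, 2, size=500)
ae = train_autoencoder(emb)
print("best neuron AUC:", neuron_scores(ae, emb, lab).max())
print("top-5 neurons by variance:", rank_neurons_task_free(ae, emb)[:5])
```

In the paper's setting, the embeddings would come from the three Java method embedding strategies and the labels from the binary labelling policies; the random arrays above only keep the sketch self-contained.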
Related papers
- Conditional computation in neural networks: principles and research trends [48.14569369912931]
This article summarizes principles and ideas from the emerging area of applying conditional computation methods to the design of neural networks.
In particular, we focus on neural networks that can dynamically activate or de-activate parts of their computational graph conditionally on their input.
arXiv Detail & Related papers (2024-03-12T11:56:38Z) - Redundancy and Concept Analysis for Code-trained Language Models [5.726842555987591]
- Redundancy and Concept Analysis for Code-trained Language Models [5.726842555987591]
Code-trained language models have proven to be highly effective for various code intelligence tasks.
However, they can be challenging to train and deploy in many software engineering applications due to computational bottlenecks and memory constraints.
We perform the first neuron-level analysis for source code models to identify important neurons within latent representations.
arXiv Detail & Related papers (2023-05-01T15:22:41Z) - Neuroevolutionary algorithms driven by neuron coverage metrics for
semi-supervised classification [60.60571130467197]
In some machine learning applications the availability of labeled instances for supervised classification is limited while unlabeled instances are abundant.
We introduce neuroevolutionary approaches that exploit unlabeled instances by using neuron coverage metrics computed on the neural network architecture encoded by each candidate solution.
arXiv Detail & Related papers (2023-03-05T23:38:44Z) - Permutation Equivariant Neural Functionals [92.0667671999604]
- Permutation Equivariant Neural Functionals [92.0667671999604]
This work studies the design of neural networks that can process the weights or gradients of other neural networks.
We focus on the permutation symmetries that arise in the weights of deep feedforward networks because hidden layer neurons have no inherent order.
In our experiments, we find that permutation equivariant neural functionals are effective on a diverse set of tasks.
arXiv Detail & Related papers (2023-02-27T18:52:38Z) - Learning Program Semantics with Code Representations: An Empirical Study [22.953964699210296]
- Learning Program Semantics with Code Representations: An Empirical Study [22.953964699210296]
Program semantics learning is core and fundamental to various code intelligence tasks.
We categorize current mainstream code representation techniques into four categories.
We evaluate their performance on three diverse and popular code intelligence tasks.
arXiv Detail & Related papers (2022-03-22T14:51:44Z) - Action in Mind: A Neural Network Approach to Action Recognition and
Segmentation [0.0]
This thesis presents a novel computational approach for human action recognition through different implementations of multi-layer architectures based on artificial neural networks.
The proposed action recognition architecture is composed of several processing layers including a preprocessing layer, an ordered vector representation layer and three layers of neural networks.
For each level of development, the system is trained on input data consisting of consecutive 3D body postures and tested on generalized input data that the system has never encountered before.
arXiv Detail & Related papers (2021-04-30T09:53:28Z) - Training Binary Neural Networks through Learning with Noisy Supervision [76.26677550127656]
This paper formalizes the binarization operations over neural networks from a learning perspective.
Experimental results on benchmark datasets indicate that the proposed binarization technique attains consistent improvements over baselines.
arXiv Detail & Related papers (2020-10-10T01:59:39Z) - Neurocoder: Learning General-Purpose Computation Using Stored Neural
- Neurocoder: Learning General-Purpose Computation Using Stored Neural Programs [64.56890245622822]
Neurocoder is an entirely new class of general-purpose conditional computational machines.
It "codes" itself in a data-responsive way by composing relevant programs from a set of shareable, modular programs.
We show new capacity to learn modular programs, handle severe pattern shifts and remember old programs as new ones are learnt.
arXiv Detail & Related papers (2020-09-24T01:39:16Z) - On the Generalizability of Neural Program Models with respect to
Semantic-Preserving Program Transformations [25.96895574298886]
We evaluate the generalizability of neural program models with respect to semantic-preserving transformations.
We use three Java datasets of different sizes and three state-of-the-art neural network models for code.
Our results suggest that neural program models based on data and control dependencies in programs generalize better than neural program models based only on abstract syntax trees.
arXiv Detail & Related papers (2020-07-31T20:39:20Z) - Non-linear Neurons with Human-like Apical Dendrite Activations [81.18416067005538]
- Non-linear Neurons with Human-like Apical Dendrite Activations [81.18416067005538]
We show that a standard neuron followed by our novel apical dendrite activation (ADA) can learn the XOR logical function with 100% accuracy.
We conduct experiments on six benchmark data sets from computer vision, signal processing and natural language processing.
arXiv Detail & Related papers (2020-02-02T21:09:39Z)