funcGNN: A Graph Neural Network Approach to Program Similarity
- URL: http://arxiv.org/abs/2007.13239v3
- Date: Thu, 30 Jul 2020 17:19:16 GMT
- Title: funcGNN: A Graph Neural Network Approach to Program Similarity
- Authors: Aravind Nair, Avijit Roy, Karl Meinke
- Abstract summary: funcGNN is a graph neural network trained on labeled CFG pairs to predict the GED between unseen program pairs by utilizing an effective embedding vector.
This is the first time graph neural networks have been applied to labeled CFGs for estimating the similarity between high-level language programs.
- Score: 0.90238471756546
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Program similarity is a fundamental concept, central to the solution of
software engineering tasks such as software plagiarism, clone identification,
code refactoring and code search. Accurate similarity estimation between
programs requires an in-depth understanding of their structure, semantics and
flow. A control flow graph (CFG) is a graphical representation of a program
which captures its logical control flow and hence its semantics. A common
approach is to estimate program similarity by analysing CFGs using graph
similarity measures, e.g. graph edit distance (GED). However, graph edit
distance is an NP-hard problem and computationally expensive, making the
application of graph similarity techniques to complex software programs
impractical. This study intends to examine the effectiveness of graph neural
networks to estimate program similarity, by analysing the associated control
flow graphs. We introduce funcGNN, which is a graph neural network trained on
labeled CFG pairs to predict the GED between unseen program pairs by utilizing
an effective embedding vector. To our knowledge, this is the first time graph
neural networks have been applied to labeled CFGs for estimating the similarity
between high-level language programs. Results: We demonstrate the effectiveness
of funcGNN in estimating the GED between programs; our experimental analysis
shows that it achieves a lower error rate (0.00194), runs 23 times faster than
the quickest traditional GED approximation method, and scales better than
state-of-the-art methods. funcGNN possesses the inductive learning ability to
infer program structure and generalise to unseen
programs. The graph embedding of a program proposed by our methodology could be
applied to several related software engineering problems (such as code
plagiarism and clone identification) thus opening multiple research directions.
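The core idea in the abstract is to replace an expensive pairwise graph comparison (GED is NP-hard) with a cheap distance between fixed-size graph embeddings. The sketch below illustrates that embed-then-compare pipeline on toy CFGs; the hand-crafted features and the `cfg_embedding`/`embedding_distance` names are purely illustrative assumptions, since funcGNN itself learns its embeddings with a graph neural network trained on labeled GED pairs.

```python
# Illustrative embed-then-compare pipeline: map each CFG to a small
# feature vector, then use vector distance as a cheap stand-in for GED.
# The features here are hand-crafted for clarity; funcGNN learns them.
import math

def cfg_embedding(cfg: dict[str, list[str]]) -> list[float]:
    """Map a CFG (adjacency list: node -> successors) to a crude feature vector."""
    num_nodes = len(cfg)
    num_edges = sum(len(succs) for succs in cfg.values())
    # Branch nodes (out-degree > 1) roughly capture control-flow complexity.
    branches = sum(1 for succs in cfg.values() if len(succs) > 1)
    return [float(num_nodes), float(num_edges), float(branches)]

def embedding_distance(a: list[float], b: list[float]) -> float:
    """Euclidean distance between embeddings, standing in for a predicted GED."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Two toy CFGs: a straight-line program and one with a single branch.
cfg_linear = {"entry": ["s1"], "s1": ["exit"], "exit": []}
cfg_branch = {"entry": ["then", "else"], "then": ["exit"], "else": ["exit"], "exit": []}

d = embedding_distance(cfg_embedding(cfg_linear), cfg_embedding(cfg_branch))
print(f"{d:.3f}")
```

Once embeddings are computed, comparing a new program against a corpus costs one vector distance per pair instead of one NP-hard graph matching per pair, which is where the reported speedup comes from.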
Related papers
- Efficient Graph Similarity Computation with Alignment Regularization [7.143879014059894]
Graph similarity computation (GSC) is a learning-based prediction task using Graph Neural Networks (GNNs).
We show that high-quality learning can be attained with a simple yet powerful regularization technique, which we call the Alignment Regularization (AReg).
In the inference stage, the graph-level representations learned by the GNN encoder are directly used to compute the similarity score without using AReg again to speed up inference.
arXiv Detail & Related papers (2024-06-21T07:37:28Z)
- Using Graph Neural Networks for Program Termination [0.0]
We move away from formal methods to embrace the nature of machine learning models.
We take advantage of the graph representation of programs by employing Graph Neural Networks.
We adapt the notions of attention and semantic segmentation, previously used for other application domains, to programs.
arXiv Detail & Related papers (2022-07-28T12:16:37Z)
- End-to-end Mapping in Heterogeneous Systems Using Graph Representation Learning [13.810753108848582]
We propose a unified, end-to-end, programmable graph representation learning framework.
It is capable of mining the complexity of high-level programs down to the universal intermediate representation, extracting the specific computational patterns and predicting which code segments would run best on a specific core.
In the evaluation, we demonstrate a maximum speedup of 6.42x compared to the thread-based execution, and 2.02x compared to the state-of-the-art technique.
arXiv Detail & Related papers (2022-04-25T22:13:13Z)
- Towards Unsupervised Deep Graph Structure Learning [67.58720734177325]
We propose an unsupervised graph structure learning paradigm, where the learned graph topology is optimized by data itself without any external guidance.
Specifically, we generate a learning target from the original data as an "anchor graph", and use a contrastive loss to maximize the agreement between the anchor graph and the learned graph.
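The contrastive objective described above can be sketched without any deep learning framework: node representations from the learned graph are pulled toward their counterparts in the anchor graph and pushed away from the rest. The InfoNCE-style loss below is a common choice for "maximize the agreement"; it is an illustrative assumption, and the paper's actual loss may differ in detail.

```python
# Minimal InfoNCE-style contrastive loss between two views of a graph:
# "learned" embeddings should agree with their matching "anchor" embeddings.
import math

def cosine(u, v):
    """Cosine similarity between two (non-zero) vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def contrastive_loss(learned, anchor, tau=0.5):
    """Average InfoNCE loss: node i of the learned view is a positive pair
    with node i of the anchor view; all other anchor nodes are negatives."""
    total = 0.0
    for i, z in enumerate(learned):
        sims = [math.exp(cosine(z, a) / tau) for a in anchor]
        total += -math.log(sims[i] / sum(sims))
    return total / len(learned)

# Toy 2-D embeddings for three nodes in each view.
learned = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
anchor = [[0.9, 0.1], [0.1, 0.9], [0.6, 0.8]]
print(round(contrastive_loss(learned, anchor), 4))
```

The loss is smallest when each node's two views coincide, which is what drives the learned topology toward agreement with the anchor graph.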
arXiv Detail & Related papers (2022-01-17T11:57:29Z)
- Training Free Graph Neural Networks for Graph Matching [103.45755859119035]
TFGM is a framework to boost the performance of Graph Neural Networks (GNNs) based graph matching without training.
Applying TFGM on various GNNs shows promising improvements over baselines.
arXiv Detail & Related papers (2022-01-14T09:04:46Z)
- CogDL: A Comprehensive Library for Graph Deep Learning [55.694091294633054]
We present CogDL, a library for graph deep learning that allows researchers and practitioners to conduct experiments, compare methods, and build applications with ease and efficiency.
In CogDL, we propose a unified design for the training and evaluation of GNN models for various graph tasks, making it unique among existing graph learning libraries.
We develop efficient sparse operators for CogDL, enabling it to become the most competitive graph library for efficiency.
arXiv Detail & Related papers (2021-03-01T12:35:16Z)
- Learning to Execute Programs with Instruction Pointer Attention Graph Neural Networks [55.98291376393561]
Graph neural networks (GNNs) have emerged as a powerful tool for learning software engineering tasks.
Recurrent neural networks (RNNs) are well-suited to long sequential chains of reasoning, but they do not naturally incorporate program structure.
We introduce a novel GNN architecture, the Instruction Pointer Attention Graph Neural Networks (IPA-GNN), which improves systematic generalization on the task of learning to execute programs.
arXiv Detail & Related papers (2020-10-23T19:12:30Z)
- Beyond Graph Neural Networks with Lifted Relational Neural Networks [14.63152363481139]
We demonstrate a declarative differentiable programming framework based on the language of Lifted Neural Networks.
Small parameterized programs are used to encode learning.
We show how this idea can be used for an efficient encoding of a diverse range of advanced neural networks.
arXiv Detail & Related papers (2020-07-13T10:10:58Z)
- GCC: Graph Contrastive Coding for Graph Neural Network Pre-Training [62.73470368851127]
Graph representation learning has emerged as a powerful technique for addressing real-world problems.
We design Graph Contrastive Coding -- a self-supervised graph neural network pre-training framework.
We conduct experiments on three graph learning tasks and ten graph datasets.
arXiv Detail & Related papers (2020-06-17T16:18:35Z)
- Graph Representation Learning via Graphical Mutual Information Maximization [86.32278001019854]
We propose a novel concept, Graphical Mutual Information (GMI), to measure the correlation between input graphs and high-level hidden representations.
We develop an unsupervised learning model trained by maximizing GMI between the input and output of a graph neural encoder.
arXiv Detail & Related papers (2020-02-04T08:33:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.