funcGNN: A Graph Neural Network Approach to Program Similarity
- URL: http://arxiv.org/abs/2007.13239v3
- Date: Thu, 30 Jul 2020 17:19:16 GMT
- Title: funcGNN: A Graph Neural Network Approach to Program Similarity
- Authors: Aravind Nair, Avijit Roy, Karl Meinke
- Abstract summary: funcGNN is a graph neural network trained on labeled CFG pairs to predict the GED between unseen program pairs by utilizing an effective embedding vector.
This is the first time graph neural networks have been applied to labeled CFGs for estimating the similarity between high-level language programs.
- Score: 0.90238471756546
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Program similarity is a fundamental concept, central to the solution of
software engineering tasks such as software plagiarism, clone identification,
code refactoring and code search. Accurate similarity estimation between
programs requires an in-depth understanding of their structure, semantics and
flow. A control flow graph (CFG) is a graphical representation of a program
which captures its logical control flow and hence its semantics. A common
approach is to estimate program similarity by analysing CFGs using graph
similarity measures, e.g. graph edit distance (GED). However, graph edit
distance is an NP-hard problem and computationally expensive, making the
application of graph similarity techniques to complex software programs
impractical. This study intends to examine the effectiveness of graph neural
networks to estimate program similarity, by analysing the associated control
flow graphs. We introduce funcGNN, which is a graph neural network trained on
labeled CFG pairs to predict the GED between unseen program pairs by utilizing
an effective embedding vector. To our knowledge, this is the first time graph
neural networks have been applied to labeled CFGs for estimating the similarity
between high-level language programs. Results: We demonstrate the effectiveness
of funcGNN in estimating the GED between programs; our experimental analysis
shows that it achieves a lower error rate (0.00194), runs 23 times faster than
the quickest traditional GED approximation method, and scales better than
state-of-the-art methods. funcGNN possesses the inductive learning ability to
infer program structure and generalise to unseen
programs. The graph embedding of a program proposed by our methodology could be
applied to several related software engineering problems (such as code
plagiarism and clone identification) thus opening multiple research directions.
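The core idea in the abstract is to replace an expensive pairwise graph comparison (GED is NP-hard) with a cheap distance between fixed-size graph embeddings. The sketch below illustrates that embed-then-compare pipeline on toy CFGs; the hand-crafted features and the `cfg_embedding`/`embedding_distance` names are purely illustrative assumptions, since funcGNN itself learns its embeddings with a graph neural network trained on labeled GED pairs.

```python
# Illustrative embed-then-compare pipeline: map each CFG to a small
# feature vector, then use vector distance as a cheap stand-in for GED.
# The features here are hand-crafted for clarity; funcGNN learns them.
import math

def cfg_embedding(cfg: dict[str, list[str]]) -> list[float]:
    """Map a CFG (adjacency list: node -> successors) to a crude feature vector."""
    num_nodes = len(cfg)
    num_edges = sum(len(succs) for succs in cfg.values())
    # Branch nodes (out-degree > 1) roughly capture control-flow complexity.
    branches = sum(1 for succs in cfg.values() if len(succs) > 1)
    return [float(num_nodes), float(num_edges), float(branches)]

def embedding_distance(a: list[float], b: list[float]) -> float:
    """Euclidean distance between embeddings, standing in for a predicted GED."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Two toy CFGs: a straight-line program and one with a single branch.
cfg_linear = {"entry": ["s1"], "s1": ["exit"], "exit": []}
cfg_branch = {"entry": ["then", "else"], "then": ["exit"], "else": ["exit"], "exit": []}

d = embedding_distance(cfg_embedding(cfg_linear), cfg_embedding(cfg_branch))
print(f"{d:.3f}")
```

Once embeddings are computed, comparing a new program against a corpus costs one vector distance per pair instead of one NP-hard graph matching per pair, which is where the reported speedup comes from.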
Related papers
- Efficient Graph Similarity Computation with Alignment Regularization [7.143879014059894]
Graph similarity computation (GSC) is a learning-based prediction task using Graph Neural Networks (GNNs).
We show that high-quality learning can be attained with a simple yet powerful regularization technique, which we call the Alignment Regularization (AReg).
In the inference stage, the graph-level representations learned by the GNN encoder are directly used to compute the similarity score without using AReg again to speed up inference.
arXiv Detail & Related papers (2024-06-21T07:37:28Z)
- Using Graph Neural Networks for Program Termination [0.0]
We move away from formal methods to embrace the nature of machine learning models.
We take advantage of the graph representation of programs by employing Graph Neural Networks.
We adapt the notions of attention and semantic segmentation, previously used for other application domains, to programs.
arXiv Detail & Related papers (2022-07-28T12:16:37Z)
- End-to-end Mapping in Heterogeneous Systems Using Graph Representation Learning [13.810753108848582]
We propose a unified, end-to-end, programmable graph representation learning framework.
It is capable of mining the complexity of high-level programs down to the universal intermediate representation, extracting the specific computational patterns and predicting which code segments would run best on a specific core.
In the evaluation, we demonstrate a maximum speedup of 6.42x compared to the thread-based execution, and 2.02x compared to the state-of-the-art technique.
arXiv Detail & Related papers (2022-04-25T22:13:13Z)
- Towards Unsupervised Deep Graph Structure Learning [67.58720734177325]
We propose an unsupervised graph structure learning paradigm, where the learned graph topology is optimized by data itself without any external guidance.
Specifically, we generate a learning target from the original data as an "anchor graph", and use a contrastive loss to maximize the agreement between the anchor graph and the learned graph.
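The contrastive objective described above can be sketched without any deep learning framework: node representations from the learned graph are pulled toward their counterparts in the anchor graph and pushed away from the rest. The InfoNCE-style loss below is a common choice for "maximize the agreement"; it is an illustrative assumption, and the paper's actual loss may differ in detail.

```python
# Minimal InfoNCE-style contrastive loss between two views of a graph:
# "learned" embeddings should agree with their matching "anchor" embeddings.
import math

def cosine(u, v):
    """Cosine similarity between two (non-zero) vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def contrastive_loss(learned, anchor, tau=0.5):
    """Average InfoNCE loss: node i of the learned view is a positive pair
    with node i of the anchor view; all other anchor nodes are negatives."""
    total = 0.0
    for i, z in enumerate(learned):
        sims = [math.exp(cosine(z, a) / tau) for a in anchor]
        total += -math.log(sims[i] / sum(sims))
    return total / len(learned)

# Toy 2-D embeddings for three nodes in each view.
learned = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
anchor = [[0.9, 0.1], [0.1, 0.9], [0.6, 0.8]]
print(round(contrastive_loss(learned, anchor), 4))
```

The loss is smallest when each node's two views coincide, which is what drives the learned topology toward agreement with the anchor graph.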
arXiv Detail & Related papers (2022-01-17T11:57:29Z)
- Training Free Graph Neural Networks for Graph Matching [103.45755859119035]
TFGM is a framework to boost the performance of Graph Neural Networks (GNNs) based graph matching without training.
Applying TFGM on various GNNs shows promising improvements over baselines.
arXiv Detail & Related papers (2022-01-14T09:04:46Z)
- CogDL: A Comprehensive Library for Graph Deep Learning [55.694091294633054]
We present CogDL, a library for graph deep learning that allows researchers and practitioners to conduct experiments, compare methods, and build applications with ease and efficiency.
In CogDL, we propose a unified design for the training and evaluation of GNN models for various graph tasks, making it unique among existing graph learning libraries.
We develop efficient sparse operators for CogDL, enabling it to become the most competitive graph library for efficiency.
arXiv Detail & Related papers (2021-03-01T12:35:16Z)
- Learning to Execute Programs with Instruction Pointer Attention Graph Neural Networks [55.98291376393561]
Graph neural networks (GNNs) have emerged as a powerful tool for learning software engineering tasks.
Recurrent neural networks (RNNs) are well-suited to long sequential chains of reasoning, but they do not naturally incorporate program structure.
We introduce a novel GNN architecture, the Instruction Pointer Attention Graph Neural Networks (IPA-GNN), which improves systematic generalization on the task of learning to execute programs.
arXiv Detail & Related papers (2020-10-23T19:12:30Z)
- Beyond Graph Neural Networks with Lifted Relational Neural Networks [14.63152363481139]
We demonstrate a declarative differentiable programming framework based on the language of Lifted Neural Networks.
Small parameterized programs are used to encode learning.
We show how this idea can be used for an efficient encoding of a diverse range of advanced neural networks.
arXiv Detail & Related papers (2020-07-13T10:10:58Z)
- GCC: Graph Contrastive Coding for Graph Neural Network Pre-Training [62.73470368851127]
Graph representation learning has emerged as a powerful technique for addressing real-world problems.
We design Graph Contrastive Coding -- a self-supervised graph neural network pre-training framework.
We conduct experiments on three graph learning tasks and ten graph datasets.
arXiv Detail & Related papers (2020-06-17T16:18:35Z)
- Graph Representation Learning via Graphical Mutual Information Maximization [86.32278001019854]
We propose a novel concept, Graphical Mutual Information (GMI), to measure the correlation between input graphs and high-level hidden representations.
We develop an unsupervised learning model trained by maximizing GMI between the input and output of a graph neural encoder.
arXiv Detail & Related papers (2020-02-04T08:33:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.