Multi-View Graph Representation for Programming Language Processing: An
Investigation into Algorithm Detection
- URL: http://arxiv.org/abs/2202.12481v1
- Date: Fri, 25 Feb 2022 03:35:45 GMT
- Title: Multi-View Graph Representation for Programming Language Processing: An
Investigation into Algorithm Detection
- Authors: Ting Long, Yutong Xie, Xianyu Chen, Weinan Zhang, Qinxiang Cao, Yong
Yu
- Abstract summary: This paper proposes a multi-view graph (MVG) program representation method.
MVG pays more attention to code semantics and simultaneously includes both data flow and control flow as multiple views.
In experiments, MVG outperforms previous methods significantly.
- Score: 35.81014952109471
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Program representation, which aims at converting program source code into
vectors with automatically extracted features, is a fundamental problem in
programming language processing (PLP). Recent work tries to represent programs
with neural networks based on source code structures. However, such methods
often focus on syntax and consider only a single perspective of programs,
limiting the representation power of models. This paper proposes a multi-view
graph (MVG) program representation method. MVG pays more attention to code
semantics and simultaneously includes both data flow and control flow as
multiple views. These views are then combined and processed by a graph neural
network (GNN) to obtain a comprehensive program representation that covers
various aspects. We thoroughly evaluate our proposed MVG approach in the
context of algorithm detection, an important and challenging subfield of PLP.
Specifically, we use a public dataset POJ-104 and also construct a new
challenging dataset ALG-109 to test our method. In experiments, MVG outperforms
previous methods significantly, demonstrating our model's strong capability of
representing source code.
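To make the idea concrete, below is a minimal sketch (plain PyTorch, not the authors' released code) of how a program could be encoded from two graph views: a data-flow adjacency and a control-flow adjacency are each processed by a simple message-passing layer, and the per-view node embeddings are combined and pooled into one program vector. All class names, dimensions, and the mean-pooling readout are illustrative assumptions, not details taken from the paper.
```python
# Hedged sketch of a multi-view program encoder: two graph views (data flow,
# control flow) over the same program nodes, combined and pooled by a GNN.
import torch
import torch.nn as nn

class SimpleGraphConv(nn.Module):
    """One round of mean-aggregation message passing over a dense adjacency."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj):
        # Row-normalize the adjacency so each node averages its neighbors' features.
        deg = adj.sum(dim=-1, keepdim=True).clamp(min=1.0)
        return torch.relu(self.linear((adj / deg) @ x))

class MultiViewProgramEncoder(nn.Module):
    """Encodes program nodes under data-flow and control-flow views,
    then pools node embeddings into a single program vector."""
    def __init__(self, num_tokens, hidden_dim=64):
        super().__init__()
        self.embed = nn.Embedding(num_tokens, hidden_dim)
        self.data_flow_conv = SimpleGraphConv(hidden_dim, hidden_dim)
        self.control_flow_conv = SimpleGraphConv(hidden_dim, hidden_dim)
        self.combine = nn.Linear(2 * hidden_dim, hidden_dim)

    def forward(self, node_tokens, data_flow_adj, control_flow_adj):
        x = self.embed(node_tokens)                          # (num_nodes, hidden_dim)
        h_df = self.data_flow_conv(x, data_flow_adj)         # data-flow view
        h_cf = self.control_flow_conv(x, control_flow_adj)   # control-flow view
        h = torch.relu(self.combine(torch.cat([h_df, h_cf], dim=-1)))
        return h.mean(dim=0)                                 # pooled program vector

# Toy usage: 4 program nodes, each view given as a dense adjacency matrix.
encoder = MultiViewProgramEncoder(num_tokens=100)
tokens = torch.tensor([3, 17, 42, 8])
df_adj = torch.eye(4)   # placeholder data-flow edges
cf_adj = torch.eye(4)   # placeholder control-flow edges
program_vec = encoder(tokens, df_adj, cf_adj)
print(program_vec.shape)  # torch.Size([64])
```
The sketch omits the part the paper emphasizes most, building the data-flow and control-flow views from source code; in an algorithm-detection setting such as POJ-104 or ALG-109, the pooled program vector would then feed a classifier over algorithm classes.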
Related papers
- Masked Image Modeling: A Survey [73.21154550957898]
Masked image modeling emerged as a powerful self-supervised learning technique in computer vision.
We construct a taxonomy and review the most prominent papers in recent years.
We aggregate the performance results of various masked image modeling methods on the most popular datasets.
arXiv Detail & Related papers (2024-08-13T07:27:02Z) - Deep Graph Reprogramming [112.34663053130073]
"Deep graph reprogramming" is a model reusing task tailored for graph neural networks (GNNs)
We propose an innovative Data Reprogramming paradigm alongside a Model Reprogramming paradigm.
arXiv Detail & Related papers (2023-04-28T02:04:29Z) - Software Vulnerability Detection via Deep Learning over Disaggregated
Code Graph Representation [57.92972327649165]
This work explores a deep learning approach to automatically learn the insecure patterns from code corpora.
Because parsed code naturally admits a graph structure, we develop a novel graph neural network (GNN) to exploit both the semantic context and structural regularity of a program.
arXiv Detail & Related papers (2021-09-07T21:24:36Z) - On the Impact of Multiple Source Code Representations on Software
Engineering Tasks -- An Empirical Study [4.049850026698639]
We modify an AST path-based approach to accept multiple representations as input to an attention-based model.
We evaluate our approach on three tasks: Method Naming, Program Classification, and Clone Detection.
arXiv Detail & Related papers (2021-06-21T08:36:38Z) - Code2Image: Intelligent Code Analysis by Computer Vision Techniques and
Application to Vulnerability Prediction [0.6091702876917281]
We present a novel method to represent source code as an image while preserving semantic and syntactic properties.
The method makes it possible to feed the resulting image representation of source code directly into deep learning (DL) algorithms as input.
We demonstrate feasibility and effectiveness of our method by realizing a vulnerability prediction use case over a public dataset.
arXiv Detail & Related papers (2021-05-07T09:10:20Z) - How to Design Sample and Computationally Efficient VQA Models [53.65668097847456]
We find that representing the text as probabilistic programs and images as object-level scene graphs best satisfy these desiderata.
We extend existing models to leverage these soft programs and scene graphs to train on question answer pairs in an end-to-end manner.
arXiv Detail & Related papers (2021-03-22T01:48:16Z) - Enhancing Handwritten Text Recognition with N-gram sequence
decomposition and Multitask Learning [36.69114677635806]
Current approaches in the field of Handwritten Text Recognition are predominantly single-task, with unigram, character-level target units.
In our work, we utilize a Multi-task Learning scheme, training the model to perform decompositions of the target sequence with target units of different granularity.
Our proposed model, even though evaluated only on the unigram task, outperforms its single-task counterpart by an absolute 2.52% WER and 1.02% CER.
arXiv Detail & Related papers (2020-12-28T19:35:40Z) - funcGNN: A Graph Neural Network Approach to Program Similarity [0.90238471756546]
FuncGNN is a graph neural network trained on labeled CFG pairs to predict the graph edit distance (GED) between unseen program pairs by utilizing an effective embedding vector.
This is the first time graph neural networks have been applied to labeled CFGs for estimating the similarity between high-level language programs.
arXiv Detail & Related papers (2020-07-26T23:16:24Z) - ProGraML: Graph-based Deep Learning for Program Optimization and
Analysis [16.520971531754018]
We introduce ProGraML, a graph-based program representation for machine learning.
ProGraML achieves an average 94.0 F1 score, significantly outperforming the state-of-the-art approaches.
We then apply our approach to two high-level tasks - heterogeneous device mapping and program classification - setting new state-of-the-art performance in both.
arXiv Detail & Related papers (2020-03-23T20:27:00Z) - Weakly Supervised Visual Semantic Parsing [49.69377653925448]
Scene Graph Generation (SGG) aims to extract entities, predicates and their semantic structure from images.
Existing SGG methods require millions of manually annotated bounding boxes for training.
We propose VSPNet, a graph-based weakly supervised learning framework for visual semantic parsing.
arXiv Detail & Related papers (2020-01-08T03:46:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.