Learning Semantic Program Embeddings with Graph Interval Neural Network
- URL: http://arxiv.org/abs/2005.09997v2
- Date: Wed, 27 May 2020 02:11:25 GMT
- Title: Learning Semantic Program Embeddings with Graph Interval Neural Network
- Authors: Yu Wang, Fengjuan Gao, Linzhang Wang, Ke Wang
- Abstract summary: We present a new graph neural architecture, called Graph Interval Neural Network (GINN), to address the weaknesses of existing GNNs.
GINN generalizes from a curated graph representation obtained through an abstraction method designed to help models learn.
We have created a neural bug detector based on GINN to catch null pointer dereference bugs in Java code.
- Score: 7.747173589929493
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning distributed representations of source code has been a challenging
task for machine learning models. Earlier works treated programs as text so
that natural language methods can be readily applied. Unfortunately, such
approaches do not capitalize on the rich structural information possessed by
source code. Of late, Graph Neural Network (GNN) was proposed to learn
embeddings of programs from their graph representations. Due to the homogeneous
and expensive message-passing procedure, GNN can suffer from precision issues,
especially when dealing with programs rendered into large graphs. In this
paper, we present a new graph neural architecture, called Graph Interval Neural
Network (GINN), to tackle the weaknesses of the existing GNN. Unlike the
standard GNN, GINN generalizes from a curated graph representation obtained
through an abstraction method designed to aid models to learn. In particular,
GINN focuses exclusively on intervals for mining the feature representation of
a program; furthermore, GINN operates on a hierarchy of intervals for scaling
the learning to large graphs. We evaluate GINN for two popular downstream
applications: variable misuse prediction and method name prediction. Results
show in both cases GINN outperforms the state-of-the-art models by a
comfortable margin. We have also created a neural bug detector based on GINN to
catch null pointer dereference bugs in Java code. While learning from the same
9,000 methods extracted from 64 projects, the GINN-based bug detector significantly
outperforms the GNN-based bug detector on 13 unseen test projects. Next, we deploy
our trained GINN-based bug detector and Facebook Infer to scan the codebases of
20 highly starred projects on GitHub. Through our manual inspection, we confirm
38 bugs out of 102 warnings raised by the GINN-based bug detector, compared to 34
bugs out of 129 warnings for Facebook Infer.
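The interval-based, hierarchical propagation described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's method: the interval partition is supplied by hand rather than computed via interval analysis on the control-flow graph, scalar features stand in for learned embeddings, and the message function is plain neighborhood averaging instead of a learned update (all names are illustrative).

```python
# Illustrative sketch of hierarchical, interval-based message passing in
# the spirit of GINN. A real implementation would derive intervals from
# the program's control-flow graph and use learned message/update
# functions; here intervals are given and updates are simple averages.

def propagate(nodes, edges, feats, steps=2):
    """Average each node's (scalar) feature with its neighbors', `steps` times.
    Edges are treated as undirected for this sketch."""
    for _ in range(steps):
        new = {}
        for v in nodes:
            nbrs = [u for (u, w) in edges if w == v] + \
                   [w for (u, w) in edges if u == v]
            vals = [feats[v]] + [feats[u] for u in nbrs]
            new[v] = sum(vals) / len(vals)
        feats = new
    return feats

def ginn_like_pass(nodes, edges, feats, intervals):
    """Two-level propagation: within each interval, then across a coarser
    graph whose nodes are the intervals themselves."""
    # 1) message passing restricted to each interval's induced subgraph
    for iv in intervals:
        sub_edges = [(u, w) for (u, w) in edges if u in iv and w in iv]
        feats.update(propagate(list(iv), sub_edges,
                               {v: feats[v] for v in iv}))
    # 2) collapse each interval to one node (mean of its members) and
    #    message-pass on the interval-level graph
    iv_feat = {i: sum(feats[v] for v in iv) / len(iv)
               for i, iv in enumerate(intervals)}
    owner = {v: i for i, iv in enumerate(intervals) for v in iv}
    iv_edges = {(owner[u], owner[w]) for (u, w) in edges
                if owner[u] != owner[w]}
    iv_feat = propagate(list(iv_feat), list(iv_edges), iv_feat)
    # 3) mix the coarse interval feature back into each member node
    return {v: 0.5 * (feats[v] + iv_feat[owner[v]]) for v in nodes}

# Usage: a 4-node chain split into two intervals; node 0 carries all the
# initial feature mass, which spreads within and then across intervals.
nodes = [0, 1, 2, 3]
edges = [(0, 1), (1, 2), (2, 3)]
out = ginn_like_pass(nodes, edges,
                     {0: 1.0, 1: 0.0, 2: 0.0, 3: 0.0},
                     intervals=[{0, 1}, {2, 3}])
# out[0] == 0.375, out[3] == 0.125
```

Restricting the first propagation stage to intervals, then working on the collapsed graph, is what lets this style of model avoid the flat, whole-graph message passing the abstract identifies as a scalability bottleneck.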
Related papers
- GREPO: A Benchmark for Graph Neural Networks on Repository-Level Bug Localization [50.009407518866965]
Repository-level bug localization is a critical software engineering challenge.
GNNs offer a promising alternative due to their ability to model complex, repository-wide dependencies.
We introduce GREPO, the first GNN benchmark for repository-scale bug localization tasks.
arXiv Detail & Related papers (2026-02-14T23:22:15Z)
- Distributed Graph Neural Network Inference With Just-In-Time Compilation For Industry-Scale Graphs [6.924892368183222]
Graph neural networks (GNNs) have delivered remarkable results in various fields.
The rapid increase in the scale of graph data has introduced significant performance bottlenecks for GNN inference.
This paper introduces an innovative processing paradigm for distributed graph learning that abstracts GNNs with a new set of programming interfaces.
arXiv Detail & Related papers (2025-03-08T13:26:59Z)
- Graph Structure Prompt Learning: A Novel Methodology to Improve Performance of Graph Neural Networks [13.655670509818144]
We propose a novel Graph structure Prompt Learning method (GPL) to enhance the training of Graph Neural Networks (GNNs).
GPL employs task-independent graph structure losses to encourage GNNs to learn intrinsic graph characteristics while simultaneously solving downstream tasks.
In experiments on eleven real-world datasets, GNNs trained with GPL significantly outperform their original performance on node classification, graph classification, and edge prediction tasks.
arXiv Detail & Related papers (2024-07-16T03:59:18Z)
- LazyGNN: Large-Scale Graph Neural Networks via Lazy Propagation [51.552170474958736]
We propose to capture long-distance dependency in graphs by shallower models instead of deeper models, which leads to a much more efficient model, LazyGNN, for graph representation learning.
LazyGNN is compatible with existing scalable approaches (such as sampling methods) for further accelerations through the development of mini-batch LazyGNN.
Comprehensive experiments demonstrate its superior prediction performance and scalability on large-scale benchmarks.
arXiv Detail & Related papers (2023-02-03T02:33:07Z)
- Geodesic Graph Neural Network for Efficient Graph Representation Learning [34.047527874184134]
We propose an efficient GNN framework called Geodesic GNN (GDGNN).
It injects conditional relationships between nodes into the model without labeling.
Conditioned on the geodesic representations, GDGNN is able to generate node, link, and graph representations that carry much richer structural information than plain GNNs.
arXiv Detail & Related papers (2022-10-06T02:02:35Z)
- GARNET: Reduced-Rank Topology Learning for Robust and Scalable Graph Neural Networks [15.448462928073635]
Graph neural networks (GNNs) have been increasingly deployed in various applications that involve learning on non-Euclidean data.
Recent studies show that GNNs are vulnerable to graph adversarial attacks.
We propose GARNET, a scalable spectral method to boost the adversarial robustness of GNN models.
arXiv Detail & Related papers (2022-01-30T06:32:44Z)
- Customizing Graph Neural Networks using Path Reweighting [23.698877985105312]
We propose a novel GNN solution, namely Customized Graph Neural Network with Path Reweighting (CustomGNN for short).
Specifically, the proposed CustomGNN can automatically learn the high-level semantics for specific downstream tasks to highlight semantically relevant paths as well as to filter out task-irrelevant noise in a graph.
In experiments with the node classification task, CustomGNN achieves state-of-the-art accuracies on three standard graph datasets and four large graph datasets.
arXiv Detail & Related papers (2021-06-21T05:38:26Z)
- Combining Label Propagation and Simple Models Out-performs Graph Neural Networks [52.121819834353865]
We show that for many standard transductive node classification benchmarks, we can exceed or match the performance of state-of-the-art GNNs.
We call this overall procedure Correct and Smooth (C&S).
Our approach exceeds or nearly matches the performance of state-of-the-art GNNs on a wide variety of benchmarks.
arXiv Detail & Related papers (2020-10-27T02:10:52Z)
- Learning to Execute Programs with Instruction Pointer Attention Graph Neural Networks [55.98291376393561]
Graph neural networks (GNNs) have emerged as a powerful tool for learning software engineering tasks.
Recurrent neural networks (RNNs) are well-suited to long sequential chains of reasoning, but they do not naturally incorporate program structure.
We introduce a novel GNN architecture, the Instruction Pointer Attention Graph Neural Networks (IPA-GNN), which improves systematic generalization on the task of learning to execute programs.
arXiv Detail & Related papers (2020-10-23T19:12:30Z)
- Distance Encoding: Design Provably More Powerful Neural Networks for Graph Representation Learning [63.97983530843762]
Graph Neural Networks (GNNs) have achieved great success in graph representation learning.
GNNs generate identical representations for graph substructures that may in fact be very different.
More powerful GNNs, proposed recently by mimicking higher-order tests, are inefficient as they cannot exploit the sparsity of the underlying graph structure.
We propose Distance Encoding (DE) as a new class of features for graph representation learning.
arXiv Detail & Related papers (2020-08-31T23:15:40Z)
- Scaling Graph Neural Networks with Approximate PageRank [64.92311737049054]
We present the PPRGo model which utilizes an efficient approximation of information diffusion in GNNs.
In addition to being faster, PPRGo is inherently scalable, and can be trivially parallelized for large datasets like those found in industry settings.
We show that training PPRGo and predicting labels for all nodes in this graph takes under 2 minutes on a single machine, far outpacing other baselines on the same graph.
arXiv Detail & Related papers (2020-07-03T09:30:07Z)
- XGNN: Towards Model-Level Explanations of Graph Neural Networks [113.51160387804484]
Graph neural networks (GNNs) learn node features by aggregating and combining neighbor information.
GNNs are mostly treated as black-boxes and lack human intelligible explanations.
We propose a novel approach, known as XGNN, to interpret GNNs at the model-level.
arXiv Detail & Related papers (2020-06-03T23:52:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.