AutoPruner: Transformer-Based Call Graph Pruning
- URL: http://arxiv.org/abs/2209.03230v1
- Date: Wed, 7 Sep 2022 15:35:28 GMT
- Title: AutoPruner: Transformer-Based Call Graph Pruning
- Authors: Thanh Le-Cong, Hong Jin Kang, Truong Giang Nguyen, Stefanus Agus
Haryono, David Lo, Xuan-Bach D. Le, Huynh Quyet Thang
- Abstract summary: We present a novel call graph pruning technique, AutoPruner, for eliminating false positives in call graphs via both statistical semantic and structural analysis.
Our empirical evaluation on a benchmark dataset of real-world programs shows that AutoPruner outperforms the state-of-the-art baselines.
- Score: 7.319973664340497
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Constructing a static call graph requires trade-offs between soundness and
precision. Program analysis techniques for constructing call graphs are,
unfortunately, usually imprecise. To address this problem, researchers have
recently proposed call graph pruning empowered by machine learning to
post-process call graphs constructed by static analysis. A machine learning
model is built to capture information from the call graph by extracting
structural features for use in a random forest classifier. It then removes
edges that are predicted to be false positives. Despite the improvements shown
by machine learning models, they remain limited: because they do not consider
source code semantics, they often cannot effectively distinguish true from
false positives. In this paper, we present a novel call graph pruning
technique, AutoPruner, for eliminating false positives in call graphs via both
statistical semantic and structural analysis. Given a call graph constructed by
traditional static analysis tools, AutoPruner takes a Transformer-based
approach to capture the semantic relationships between the caller and callee
functions associated with each edge in the call graph. To do so, AutoPruner
fine-tunes a model of code that was pre-trained on a large corpus to represent
source code based on descriptions of its semantics. Next, the model is used to
extract semantic features from the functions related to each edge in the call
graph. AutoPruner uses these semantic features together with the structural
features extracted from the call graph to classify each edge via a feed-forward
neural network. Our empirical evaluation on a benchmark dataset of real-world
programs shows that AutoPruner outperforms the state-of-the-art baselines,
improving on F-measure by up to 13% in identifying false-positive edges in a
static call graph.
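To make the described pipeline concrete, a minimal sketch follows. It is illustrative only, not the authors' released implementation: the microsoft/codebert-base checkpoint, the four structural features, and the head dimensions are assumptions standing in for whatever the paper actually uses.

```python
# Minimal sketch of Transformer-based call graph edge pruning.
# Assumptions (not the paper's exact setup): CodeBERT as the
# pre-trained model of code, four structural features per edge,
# and a two-layer feed-forward head.
import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModel

class EdgeClassifier(nn.Module):
    def __init__(self, encoder_name="microsoft/codebert-base", n_struct=4):
        super().__init__()
        self.tokenizer = AutoTokenizer.from_pretrained(encoder_name)
        self.encoder = AutoModel.from_pretrained(encoder_name)
        hidden = self.encoder.config.hidden_size  # 768 for CodeBERT
        # Feed-forward network over [semantic ; structural] features.
        self.head = nn.Sequential(
            nn.Linear(hidden + n_struct, 256),
            nn.ReLU(),
            nn.Linear(256, 2),  # true edge vs. false positive
        )

    def forward(self, caller_src, callee_src, struct_feats):
        # Encode caller and callee as one sequence pair so self-attention
        # can relate the two functions' tokens.
        batch = self.tokenizer(caller_src, callee_src, padding=True,
                               truncation=True, max_length=512,
                               return_tensors="pt")
        out = self.encoder(**batch)
        semantic = out.last_hidden_state[:, 0]  # pooled first-token embedding
        return self.head(torch.cat([semantic, struct_feats], dim=-1))

# Hypothetical edge: caller/callee bodies plus structural features
# (e.g., in/out degrees of the two nodes in the static call graph).
model = EdgeClassifier()
logits = model(["def main():\n    run()"], ["def run():\n    pass"],
               torch.tensor([[2.0, 1.0, 0.0, 1.0]]))
prune = logits.argmax(-1) == 1  # True would mean "remove this edge"
```

Edges scored as false positives are then removed from the call graph produced by the static analyzer.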
Related papers
- An Automatic Graph Construction Framework based on Large Language Models for Recommendation [49.51799417575638]
We introduce AutoGraph, an automatic graph construction framework based on large language models for recommendation.
LLMs infer user preferences and item knowledge, which are encoded as semantic vectors.
Latent factors are incorporated as extra nodes to link the user/item nodes, resulting in a graph with in-depth global-view semantics.
arXiv Detail & Related papers (2024-12-24T07:51:29Z)
- Similarity-aware Positive Instance Sampling for Graph Contrastive Pre-training [82.68805025636165]
We propose to select positive graph instances directly from existing graphs in the training set.
Our selection is based on certain domain-specific pair-wise similarity measurements.
In addition, we develop an adaptive node-level pre-training method that dynamically masks nodes to distribute them evenly across the graph.
arXiv Detail & Related papers (2022-06-23T20:12:51Z)
- Explanation Graph Generation via Pre-trained Language Models: An Empirical Study with Contrastive Learning [84.35102534158621]
We study pre-trained language models that generate explanation graphs in an end-to-end manner.
We propose simple yet effective ways of graph perturbations via node and edge edit operations.
Our methods lead to significant improvements in both structural and semantic accuracy of explanation graphs.
arXiv Detail & Related papers (2022-04-11T00:58:27Z)
- Multi-task Self-distillation for Graph-based Semi-Supervised Learning [6.277952154365413]
We propose a multi-task self-distillation framework that injects self-supervised learning and self-distillation into graph convolutional networks.
First, we formulate a self-supervision pipeline based on pre-text tasks to capture different levels of similarities in graphs.
Second, self-distillation uses soft labels of the model itself as additional supervision.
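For intuition, the soft-label term can be sketched as a standard knowledge-distillation loss in which the model's own (detached) earlier predictions act as the teacher; this is a generic sketch, not the paper's exact multi-task formulation.

```python
import torch.nn.functional as F

def self_distillation_loss(student_logits, teacher_logits, labels,
                           temperature=2.0, alpha=0.5):
    # Supervised term on hard labels.
    hard = F.cross_entropy(student_logits, labels)
    # Soft term: match the model's own earlier, detached predictions.
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits.detach() / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2  # standard scale correction for soft targets
    return alpha * hard + (1 - alpha) * soft
```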
arXiv Detail & Related papers (2021-12-02T12:43:41Z)
- Joint Graph Learning and Matching for Semantic Feature Correspondence [69.71998282148762]
We propose a joint graph learning and matching network, named GLAM, to explore reliable graph structures for boosting graph matching.
The proposed method is evaluated on three popular visual matching benchmarks (Pascal VOC, Willow Object and SPair-71k).
It outperforms previous state-of-the-art graph matching methods by significant margins on all benchmarks.
arXiv Detail & Related papers (2021-09-01T08:24:02Z)
- Temporal Graph Network Embedding with Causal Anonymous Walks Representations [54.05212871508062]
We propose a novel approach for dynamic network representation learning based on Temporal Graph Network.
We also provide a benchmark pipeline for evaluating temporal network embeddings.
We show the applicability and superior performance of our model in the real-world downstream graph machine learning task provided by one of the top European banks.
arXiv Detail & Related papers (2021-08-19T15:39:52Z)
- Partial Graph Reasoning for Neural Network Regularization [26.793648333908905]
Dropout, as a commonly used regularization technique, disables neuron activations during network optimization.
We present DropGraph, which learns a regularization function by constructing a stand-alone graph from the backbone features.
This add-on graph regularizes the network during training and can be completely skipped during inference.
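The train-only pattern can be sketched as follows; the kNN construction and mixing step here are assumptions for illustration, not DropGraph's actual graph-building procedure.

```python
import torch
import torch.nn as nn

class GraphRegularizer(nn.Module):
    """Sketch: build a kNN graph over backbone features and mix in one
    propagation step during training; skip entirely at inference."""
    def __init__(self, k=4, alpha=0.1):
        super().__init__()
        self.k, self.alpha = k, alpha

    def forward(self, feats):  # feats: (batch, dim)
        if not self.training:
            return feats  # the add-on graph costs nothing at inference
        sim = feats @ feats.t()                 # pairwise similarity
        idx = sim.topk(self.k, dim=-1).indices  # k nearest neighbors
        adj = torch.zeros_like(sim).scatter_(1, idx, 1.0)
        adj = adj / adj.sum(-1, keepdim=True)   # row-normalize
        return (1 - self.alpha) * feats + self.alpha * adj @ feats
```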
arXiv Detail & Related papers (2021-06-03T12:57:01Z)
- Learnable Graph Matching: Incorporating Graph Partitioning with Deep Feature Learning for Multiple Object Tracking [58.30147362745852]
Data association across frames is at the core of the Multiple Object Tracking (MOT) task.
Existing methods mostly ignore the context information among tracklets and intra-frame detections.
We propose a novel learnable graph matching method to address these issues.
arXiv Detail & Related papers (2021-03-30T08:58:45Z)
- Line Graph Neural Networks for Link Prediction [71.00689542259052]
We consider the graph link prediction task, which is a classic graph analytical problem with many real-world applications.
In this formalism, a link prediction problem is converted to a graph classification task.
We propose to seek a radically different and novel path by making use of the line graphs in graph theory.
In particular, each node in a line graph corresponds to a unique edge in the original graph. Therefore, link prediction problems in the original graph can be equivalently solved as a node classification problem in its corresponding line graph, instead of a graph classification task.
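The underlying transformation is standard graph theory; for example, with networkx:

```python
import networkx as nx

# Each edge of G becomes a node of L(G); two nodes of L(G) are adjacent
# exactly when the corresponding edges of G share an endpoint.
G = nx.Graph([(1, 2), (2, 3), (3, 4)])
L = nx.line_graph(G)

print(sorted(L.nodes()))  # [(1, 2), (2, 3), (3, 4)]
# Predicting link (2, 3) in G is classifying node (2, 3) in L(G).
```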
arXiv Detail & Related papers (2020-10-20T05:54:31Z)
- Learning Graph Structure With A Finite-State Automaton Layer [31.028101360041227]
We study the problem of learning to derive abstract relations from the intrinsic graph structure.
We show how to learn these relations end-to-end by relaxing the problem into learning finite-state automata policies.
We demonstrate that this layer can find shortcuts in grid-world graphs and reproduce simple static analyses on Python programs.
arXiv Detail & Related papers (2020-07-09T17:01:34Z)
- Non-Parametric Graph Learning for Bayesian Graph Neural Networks [35.88239188555398]
We propose a novel non-parametric graph model for constructing the posterior distribution of graph adjacency matrices.
We demonstrate the advantages of this model in three different problem settings: node classification, link prediction and recommendation.
arXiv Detail & Related papers (2020-06-23T21:10:55Z)