Scalable and Precise Application-Centered Call Graph Construction for Python
- URL: http://arxiv.org/abs/2305.05949v5
- Date: Tue, 10 Sep 2024 02:15:04 GMT
- Title: Scalable and Precise Application-Centered Call Graph Construction for Python
- Authors: Kaifeng Huang, Yixuan Yan, Bihuan Chen, Zixin Tao, Xin Peng
- Abstract summary: PyCG is the state-of-the-art approach for constructing call graphs for Python programs.
We propose a scalable and precise approach for constructing application-centered call graphs for Python programs, and implement it as a prototype tool JARVIS.
Taking one function as an input, JARVIS generates the call graph on-the-fly, where flow-sensitive intra-procedural analysis and inter-procedural analysis are conducted.
- Score: 4.655332013331494
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Call graph construction is the foundation of inter-procedural static analysis. PyCG is the state-of-the-art approach for constructing call graphs for Python programs. Unfortunately, PyCG does not scale to large programs when adapted to whole-program analysis, where the application and its dependent libraries are both analyzed. Moreover, PyCG is flow-insensitive and does not fully support Python's features, which hinders its accuracy. To overcome these drawbacks, we propose a scalable and precise approach for constructing application-centered call graphs for Python programs, and implement it as a prototype tool, JARVIS. JARVIS maintains a type graph (i.e., type relations of program identifiers) for each function in a program to allow type inference. Taking one function as an input, JARVIS generates the call graph on-the-fly, conducting flow-sensitive intra-procedural analysis and inter-procedural analysis in turn and applying strong updates. Our evaluation on a micro-benchmark of 135 small Python programs and a macro-benchmark of 6 real-world Python applications has demonstrated that JARVIS significantly improves on PyCG: it is at least 67% faster, 84% higher in precision, and at least 20% higher in recall.
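The difference between flow-insensitive resolution and flow-sensitive resolution with strong updates can be illustrated with a toy example. The sketch below is not JARVIS's or PyCG's actual algorithm; `function_alias`, `resolve_calls`, and the `SOURCE` snippet are hypothetical names introduced only for illustration, and the analysis handles nothing beyond `var = func` assignments and direct `name()` calls inside a single function.

```python
import ast

# Toy program: `f` is rebound between its two call sites.
SOURCE = """
def a(): pass
def b(): pass

def main():
    f = a
    f()
    f = b
    f()
"""

def function_alias(stmt, defined):
    """Return (variable, function) for a statement `var = func`, else None."""
    if (isinstance(stmt, ast.Assign) and len(stmt.targets) == 1
            and isinstance(stmt.targets[0], ast.Name)
            and isinstance(stmt.value, ast.Name)
            and stmt.value.id in defined):
        return stmt.targets[0].id, stmt.value.id
    return None

def resolve_calls(source, entry, flow_sensitive):
    """List the possible callees of each `name()` call in `entry`, in order."""
    tree = ast.parse(source)
    functions = {n.name: n for n in tree.body if isinstance(n, ast.FunctionDef)}
    body = functions[entry].body
    points_to = {}
    if not flow_sensitive:
        # Flow-insensitive: merge every assignment up front, ignoring order.
        for stmt in body:
            alias = function_alias(stmt, functions)
            if alias:
                points_to.setdefault(alias[0], set()).add(alias[1])
    resolved = []
    for stmt in body:
        alias = function_alias(stmt, functions)
        if alias and flow_sensitive:
            # Strong update: the new binding replaces the old one.
            points_to[alias[0]] = {alias[1]}
        elif (isinstance(stmt, ast.Expr) and isinstance(stmt.value, ast.Call)
                and isinstance(stmt.value.func, ast.Name)):
            name = stmt.value.func.id
            default = {name} if name in functions else set()
            resolved.append(sorted(points_to.get(name, default)))
    return resolved

if __name__ == "__main__":
    print(resolve_calls(SOURCE, "main", flow_sensitive=True))   # [['a'], ['b']]
    print(resolve_calls(SOURCE, "main", flow_sensitive=False))  # [['a', 'b'], ['a', 'b']]
```

In this toy setting the flow-insensitive variant reports both `a` and `b` at each call site, while the flow-sensitive variant with strong updates reports exactly one callee per site, which is the kind of precision gain the abstract attributes to flow sensitivity.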
Related papers
- DyPyBench: A Benchmark of Executable Python Software [18.129031749321058]
We present DyPyBench, the first benchmark of Python projects that is large scale, diverse, ready to run and ready to analyze.
The benchmark encompasses 50 popular open-source projects from various application domains, with a total of 681k lines of Python code and 30k test cases.
We envision DyPyBench to provide a basis for other dynamic analyses and for studying the runtime behavior of Python code.
arXiv Detail & Related papers (2024-03-01T13:53:15Z)
- PyBADS: Fast and robust black-box optimization in Python [11.4219428942199]
PyBADS is an implementation of the Bayesian Adaptive Direct Search (BADS) algorithm for fast and robust black-box optimization.
It comes with an easy-to-use Python interface for running the algorithm and inspecting the results.
arXiv Detail & Related papers (2023-06-27T15:54:44Z)
- PyRCA: A Library for Metric-based Root Cause Analysis [66.72542200701807]
PyRCA is an open-source machine learning library of Root Cause Analysis (RCA) for Artificial Intelligence for IT Operations (AIOps).
It provides a holistic framework to uncover the complicated metric causal dependencies and automatically locate root causes of incidents.
arXiv Detail & Related papers (2023-06-20T09:55:10Z)
- RAGO: Recurrent Graph Optimizer For Multiple Rotation Averaging [62.315673415889314]
This paper proposes a deep recurrent Rotation Averaging Graph Optimizer (RAGO) for Multiple Rotation Averaging (MRA).
Our framework is a real-time learning-to-optimize rotation averaging graph with a tiny size deployed for real-world applications.
arXiv Detail & Related papers (2022-12-14T13:19:40Z)
- GraphQ IR: Unifying Semantic Parsing of Graph Query Language with Intermediate Representation [91.27083732371453]
We propose a unified intermediate representation (IR) for graph query languages, namely GraphQ IR.
With the IR's natural-language-like representation that bridges the semantic gap and its formally defined syntax that maintains the graph structure, neural semantic parsing can more effectively convert user queries into GraphQ IR.
Our approach can consistently achieve state-of-the-art performance on KQA Pro, Overnight and MetaQA.
arXiv Detail & Related papers (2022-05-24T13:59:53Z)
- PyGOD: A Python Library for Graph Outlier Detection [56.33769221859135]
PyGOD is an open-source library for detecting outliers in graph data.
It supports a wide array of leading graph-based methods for outlier detection.
PyGOD is released under a BSD 2-Clause license at https://pygod.org and at the Python Package Index (PyPI).
arXiv Detail & Related papers (2022-04-26T06:15:21Z)
- Python for Smarter Cities: Comparison of Python libraries for static and interactive visualisations of large vector data [0.0]
Python, with its concise and natural syntax, presents a low barrier to entry for municipal staff without computer science backgrounds.
This study assesses prominent, actively-developed visualisation libraries in the Python ecosystem with respect to producing visualisations of large vector datasets.
All short-listed libraries were able to generate the sample map products for both a small and larger dataset.
arXiv Detail & Related papers (2022-02-26T10:23:29Z)
- PyLUSAT: An open-source Python toolkit for GIS-based land use suitability analysis [0.1611401281366893]
This paper introduces PyLUSAT: Python for Land Use Suitability Analysis Tools.
PyLUSAT is an open-source software package that provides a series of tools to conduct various tasks in a suitability modeling workflow.
It was evaluated against comparable tools in ArcMap 10.4 with respect to both accuracy and computational efficiency.
arXiv Detail & Related papers (2021-07-04T16:19:16Z)
- Scaling Graph Neural Networks with Approximate PageRank [64.92311737049054]
We present the PPRGo model which utilizes an efficient approximation of information diffusion in GNNs.
In addition to being faster, PPRGo is inherently scalable, and can be trivially parallelized for large datasets like those found in industry settings.
We show that training PPRGo and predicting labels for all nodes in this graph takes under 2 minutes on a single machine, far outpacing other baselines on the same graph.
arXiv Detail & Related papers (2020-07-03T09:30:07Z)
- OPFython: A Python-Inspired Optimum-Path Forest Classifier [68.8204255655161]
This paper proposes a Python-based Optimum-Path Forest framework, denoted as OPFython.
As OPFython is a Python-based library, it provides a more friendly environment and a faster prototyping workspace than the C language.
arXiv Detail & Related papers (2020-01-28T15:46:19Z)