PCART: Automated Repair of Python API Parameter Compatibility Issues
- URL: http://arxiv.org/abs/2406.03839v1
- Date: Thu, 6 Jun 2024 08:15:12 GMT
- Title: PCART: Automated Repair of Python API Parameter Compatibility Issues
- Authors: Shuai Zhang, Guanping Xiao, Jun Wang, Huashan Lei, Yepang Liu, Yulei Sui, Zheng Zheng
- Abstract summary: Python third-party libraries have become crucial, particularly in fields such as deep learning and scientific computing.
The parameters of APIs in third-party libraries often change during evolution, causing compatibility issues for client applications that depend on specific versions.
No tool is capable of automatically detecting and repairing Python API parameter compatibility issues.
- Score: 17.2223738707004
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In modern software development, Python third-party libraries have become crucial, particularly due to their widespread use in fields such as deep learning and scientific computing. However, the parameters of APIs in third-party libraries often change during evolution, causing compatibility issues for client applications that depend on specific versions. Due to Python's flexible parameter-passing mechanism, different methods of parameter passing can result in different API compatibility. Currently, no tool is capable of automatically detecting and repairing Python API parameter compatibility issues. To fill this gap, we propose PCART, the first to implement a fully automated process from API extraction, code instrumentation, and API mapping establishment, to compatibility assessment, and finally to repair and validation, for solving various types of Python API parameter compatibility issues, i.e., parameter addition, removal, renaming, reordering of parameters, as well as the conversion of positional parameters to keyword parameters. We construct a large-scale benchmark PCBENCH, including 47,478 test cases mutated from 844 parameter-changed APIs of 33 popular Python libraries, to evaluate PCART. The evaluation results show that PCART is effective yet efficient, significantly outperforming existing tools (MLCatchUp and Relancer) and the large language model ChatGPT-4, achieving an F-measure of 96.49% in detecting API parameter compatibility issues and a repair accuracy of 91.36%. The evaluation on 14 real-world Python projects from GitHub further demonstrates that PCART has good practicality. We believe PCART can help programmers reduce the time spent on maintaining Python API updates and facilitate automated Python API compatibility issue repair.
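The abstract's point that different ways of passing the same argument can differ in compatibility can be illustrated with a small sketch. This is not PCART's implementation. The functions `resize_v1` and `resize_v2` are hypothetical stand-ins for two versions of a library API, and the check uses only the standard-library `inspect.signature(...).bind(...)` to test whether a call's arguments still fit a signature. It assumes that a successful bind is a reasonable proxy for compatibility, which ignores semantic changes.

```python
import inspect


# Hypothetical "old" version of a library API (illustrative only).
def resize_v1(image, size, interp="bilinear"):
    return ("v1", image, size, interp)


# Hypothetical "new" version: `interp` was renamed to `mode`,
# and a keyword-only parameter `antialias` was added.
def resize_v2(image, size, mode="bilinear", *, antialias=False):
    return ("v2", image, size, mode, antialias)


def is_compatible(func, args, kwargs):
    """Return True if (args, kwargs) can be bound to func's signature."""
    try:
        inspect.signature(func).bind(*args, **kwargs)
        return True
    except TypeError:
        return False


def repair_renamed_kwargs(kwargs, renames):
    """Rewrite keyword names using an old-name -> new-name mapping."""
    return {renames.get(name, name): value for name, value in kwargs.items()}


# The same logical call, written two ways against the old API.
positional_call = (("img", (64, 64), "nearest"), {})
keyword_call = (("img", (64, 64)), {"interp": "nearest"})

results = {
    # Positional passing survives the rename: the third slot is still filled.
    "positional": is_compatible(resize_v2, *positional_call),
    # Keyword passing breaks: `interp` no longer exists in the new signature.
    "keyword": is_compatible(resize_v2, *keyword_call),
}

# A rename repair for the keyword form, given a known parameter mapping.
repaired_kwargs = repair_renamed_kwargs(keyword_call[1], {"interp": "mode"})
repaired_ok = is_compatible(resize_v2, keyword_call[0], repaired_kwargs)
```

Here the positional call still binds to the new signature while the keyword call does not, and renaming the keyword restores it. A real tool would also need to discover the old-to-new parameter mapping and validate the repaired call, which this sketch takes as given.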
Related papers
- A Large-scale Investigation of Semantically Incompatible APIs behind Compatibility Issues in Android Apps [13.24503570840706]
We conduct a large-scale discovery of incompatible APIs in the Android Open Source Project (AOSP).
We propose a unified framework to detect incompatible APIs, especially for semantic changes.
Our approach detects 5,481 incompatible APIs spanning from version 4 to version 33.
arXiv Detail & Related papers (2024-06-25T10:12:37Z) - Exception-aware Lifecycle Model Construction for Framework APIs [4.333061751230906]
This paper adopts a static analysis technique to extract exception summary information in the framework API code.
It generates exception-aware API lifecycle models for the given framework/library project.
Compared to exception-unaware API lifecycle modeling on 60 versions, JavaExp can identify 18% more API changes.
arXiv Detail & Related papers (2024-01-05T06:35:47Z) - PyBADS: Fast and robust black-box optimization in Python [11.4219428942199]
PyBADS is an implementation of the Bayesian Adaptive Direct Search (BADS) algorithm for fast and robust black-box optimization.
It comes with an easy-to-use Python interface for running the algorithm and inspecting the results.
arXiv Detail & Related papers (2023-06-27T15:54:44Z) - Scalable and Precise Application-Centered Call Graph Construction for Python [10.549200851675826]
PyCG is the state-of-the-art approach for constructing call graphs for Python programs.
We propose a scalable and precise approach for constructing application-centered call graphs for Python programs, and implement it as a prototype tool JARVIS.
Taking one function as an input, JARVIS generates the call graph on-the-fly, where flow-sensitive intra-procedural analysis and inter-procedural analysis are conducted.
arXiv Detail & Related papers (2023-05-10T07:40:05Z) - PyHopper -- Hyperparameter optimization [51.40201315676902]
We present PyHopper, a black-box optimization platform for machine learning researchers.
PyHopper's goal is to integrate with existing code with minimal effort and run the optimization process with minimal necessary manual oversight.
With simplicity as the primary theme, PyHopper is powered by a single robust Markov-chain Monte-Carlo optimization algorithm.
arXiv Detail & Related papers (2022-10-10T14:35:01Z) - Repairing Bugs in Python Assignments Using Large Language Models [9.973714032271708]
We propose to use a large language model trained on code to build an APR system for programming assignments.
Our system can fix both syntactic and semantic mistakes by combining multi-modal prompts, iterative querying, test-case-based selection of few-shots, and program chunking.
We evaluate MMAPR on 286 real student programs and compare to a baseline built by combining a state-of-the-art Python syntax repair engine, BIFI, and state-of-the-art Python semantic repair engine for student assignments, Refactory.
arXiv Detail & Related papers (2022-09-29T15:41:17Z) - PyGOD: A Python Library for Graph Outlier Detection [56.33769221859135]
PyGOD is an open-source library for detecting outliers in graph data.
It supports a wide array of leading graph-based methods for outlier detection.
PyGOD is released under a BSD 2-Clause license at https://pygod.org and at the Python Package Index (PyPI).
arXiv Detail & Related papers (2022-04-26T06:15:21Z) - PyHHMM: A Python Library for Heterogeneous Hidden Markov Models [63.01207205641885]
PyHHMM is an object-oriented Python implementation of Heterogeneous-Hidden Markov Models (HHMMs).
PyHHMM emphasizes features not supported in similar available frameworks: a heterogeneous observation model, missing data inference, different model order selection criteria, and semi-supervised training.
PyHHMM relies on the numpy, scipy, scikit-learn, and seaborn Python packages, and is distributed under the Apache-2.0 License.
arXiv Detail & Related papers (2022-01-12T07:32:36Z) - PyHealth: A Python Library for Health Predictive Models [53.848478115284195]
PyHealth is an open-source Python toolbox for developing various predictive models on healthcare data.
The data preprocessing module enables the transformation of complex healthcare datasets into machine learning friendly formats.
The predictive modeling module provides more than 30 machine learning models, including established ensemble trees and deep neural network-based approaches.
arXiv Detail & Related papers (2021-01-11T22:02:08Z) - MOGPTK: The Multi-Output Gaussian Process Toolkit [71.08576457371433]
We present MOGPTK, a Python package for multi-channel data modelling using Gaussian processes (GP).
The aim of this toolkit is to make multi-output GP (MOGP) models accessible to researchers, data scientists, and practitioners alike.
arXiv Detail & Related papers (2020-02-09T23:34:49Z) - OPFython: A Python-Inspired Optimum-Path Forest Classifier [68.8204255655161]
This paper proposes a Python-based Optimum-Path Forest framework, denoted as OPFython.
As OPFython is a Python-based library, it provides a more friendly environment and a faster prototyping workspace than the C language.
arXiv Detail & Related papers (2020-01-28T15:46:19Z)