PCART: Automated Repair of Python API Parameter Compatibility Issues
- URL: http://arxiv.org/abs/2406.03839v3
- Date: Sun, 02 Mar 2025 12:21:40 GMT
- Title: PCART: Automated Repair of Python API Parameter Compatibility Issues
- Authors: Shuai Zhang, Guanping Xiao, Jun Wang, Huashan Lei, Gangqiang He, Yepang Liu, Zheng Zheng
- Abstract summary: Python third-party libraries play a critical role, especially in fields like deep learning and scientific computing. API parameters in these libraries often change during evolution, leading to compatibility issues for client applications reliant on specific versions. No tool can automatically detect and repair Python API parameter compatibility issues. PCART is the first solution to fully automate the process of API extraction, code instrumentation, API mapping establishment, compatibility assessment, repair, and validation.
- Score: 11.36053416670063
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In modern software development, Python third-party libraries play a critical role, especially in fields like deep learning and scientific computing. However, API parameters in these libraries often change during evolution, leading to compatibility issues for client applications reliant on specific versions. Python's flexible parameter-passing mechanism further complicates this, as different passing methods can result in different API compatibility. Currently, no tool can automatically detect and repair Python API parameter compatibility issues. To fill this gap, we introduce PCART, the first solution to fully automate the process of API extraction, code instrumentation, API mapping establishment, compatibility assessment, repair, and validation. PCART handles various types of Python API parameter compatibility issues, including parameter addition, removal, renaming, reordering, and the conversion of positional to keyword parameters. To evaluate PCART, we construct PCBENCH, a large-scale benchmark comprising 47,478 test cases mutated from 844 parameter-changed APIs across 33 popular Python libraries. Evaluation results demonstrate that PCART is both effective and efficient, significantly outperforming existing tools (MLCatchUp and Relancer) and the large language model ChatGPT (GPT-4o), achieving an F1-score of 96.49% in detecting API parameter compatibility issues and a repair precision of 92.26%. Further evaluation on 30 real-world Python projects from GitHub confirms PCART's practicality. We believe PCART can significantly reduce the time programmers spend maintaining Python API updates and advance the automation of Python API compatibility issue repair.
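The abstract's point that different passing methods can lead to different compatibility outcomes can be illustrated with a small sketch. This is not PCART's implementation: `fit_v1`, `fit_v2`, and `binds` are hypothetical names invented here, and the check only tests whether a call's arguments fit a signature, using the standard-library `inspect` module.

```python
import inspect

# Hypothetical "old" and "new" versions of one library API. The names and
# parameters are invented for illustration and do not come from PCART.
def fit_v1(data, n_iter=10, verbose=False):
    return (data, n_iter, verbose)

def fit_v2(data, *, max_iter=10, verbose=False):
    # n_iter was renamed to max_iter, and every parameter after `data`
    # became keyword-only (positional-to-keyword conversion).
    return (data, max_iter, verbose)

def binds(func, *args, **kwargs):
    """Return True if the given arguments fit func's signature."""
    try:
        inspect.signature(func).bind(*args, **kwargs)
        return True
    except TypeError:
        return False

# The same logical call written with different passing styles.
calls = {
    "fit(d)": (("d",), {}),
    "fit(d, 20)": (("d", 20), {}),
    "fit(d, n_iter=20)": (("d",), {"n_iter": 20}),
    "fit(d, 20, True)": (("d", 20, True), {}),
    "fit(d, verbose=True)": (("d",), {"verbose": True}),
}

# Map each call to (fits old signature, fits new signature).
report = {
    label: (binds(fit_v1, *args, **kwargs), binds(fit_v2, *args, **kwargs))
    for label, (args, kwargs) in calls.items()
}

for label, (old_ok, new_ok) in report.items():
    print(f"{label:22} v1={old_ok} v2={new_ok}")
```

Passing `verbose` by keyword survives the upgrade, while passing it positionally does not, even though both calls mean the same thing under the old version. A signature check like this says nothing about behavioral changes; it only shows why the way a parameter is passed matters.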
Related papers
- PCREQ: Automated Inference of Compatible Requirements for Python Third-party Library Upgrades [5.857193811761703]
Python third-party libraries (TPLs) are essential in modern software development, but upgrades often cause compatibility issues, leading to system failures. Existing tools mainly detect dependency conflicts but overlook code-level incompatibilities. We propose PCREQ, the first approach to automatically infer compatible requirements by combining version and code compatibility analysis.
arXiv Detail & Related papers (2025-08-04T03:34:30Z) - SocialED: A Python Library for Social Event Detection [53.928241775629566]
SocialED is a comprehensive, open-source Python library designed to support social event detection (SED) tasks.
It provides a unified API with detailed documentation, offering researchers and practitioners a complete solution for event detection in social media.
SocialED supports a wide range of preprocessing techniques, such as graph construction and tokenization, and includes standardized interfaces for training models and making predictions.
arXiv Detail & Related papers (2024-12-18T03:37:47Z) - PyPulse: A Python Library for Biosignal Imputation [58.35269251730328]
We introduce PyPulse, a Python package for imputation of biosignals in both clinical and wearable sensor settings.
PyPulse provides a modular and extendable framework with high ease of use for a broad user base, including non-machine-learning bioresearchers.
We released PyPulse under the MIT License on GitHub and PyPI.
arXiv Detail & Related papers (2024-12-09T11:00:55Z) - A Large-scale Investigation of Semantically Incompatible APIs behind Compatibility Issues in Android Apps [13.24503570840706]
We conduct a large-scale discovery of incompatible APIs in the Android Open Source Project (AOSP).
We propose a unified framework to detect incompatible APIs, especially for semantic changes.
Our approach detects 5,481 incompatible APIs spanning from version 4 to version 33.
arXiv Detail & Related papers (2024-06-25T10:12:37Z) - Exception-aware Lifecycle Model Construction for Framework APIs [4.333061751230906]
This paper adopts a static analysis technique to extract exception summary information in the framework API code.
It generates exception-aware API lifecycle models for the given framework/library project.
Compared to exception-unaware API lifecycle modeling on 60 versions, JavaExp can identify 18% more API changes.
arXiv Detail & Related papers (2024-01-05T06:35:47Z) - PyBADS: Fast and robust black-box optimization in Python [11.4219428942199]
PyBADS is an implementation of the Bayesian Adaptive Direct Search (BADS) algorithm for fast and robust black-box optimization.
It comes with an easy-to-use Python interface for running the algorithm and inspecting its results.
arXiv Detail & Related papers (2023-06-27T15:54:44Z) - Scalable and Precise Application-Centered Call Graph Construction for Python [4.655332013331494]
PyCG is the state-of-the-art approach for constructing call graphs for Python programs.
We propose a scalable and precise approach for constructing application-centered call graphs for Python programs, and implement it as a prototype tool JARVIS.
Taking one function as an input, JARVIS generates the call graph on-the-fly, where flow-sensitive intra-procedural analysis and inter-procedural analysis are conducted.
arXiv Detail & Related papers (2023-05-10T07:40:05Z) - PyHopper -- Hyperparameter optimization [51.40201315676902]
We present PyHopper, a black-box optimization platform for machine learning researchers.
PyHopper's goal is to integrate with existing code with minimal effort and run the optimization process with minimal necessary manual oversight.
With simplicity as the primary theme, PyHopper is powered by a single robust Markov-chain Monte-Carlo optimization algorithm.
arXiv Detail & Related papers (2022-10-10T14:35:01Z) - Repairing Bugs in Python Assignments Using Large Language Models [9.973714032271708]
We propose to use a large language model trained on code to build an APR system for programming assignments.
Our system can fix both syntactic and semantic mistakes by combining multi-modal prompts, iterative querying, test-case-based selection of few-shots, and program chunking.
We evaluate MMAPR on 286 real student programs and compare to a baseline built by combining a state-of-the-art Python syntax repair engine, BIFI, and state-of-the-art Python semantic repair engine for student assignments, Refactory.
arXiv Detail & Related papers (2022-09-29T15:41:17Z) - PyGOD: A Python Library for Graph Outlier Detection [56.33769221859135]
PyGOD is an open-source library for detecting outliers in graph data.
It supports a wide array of leading graph-based methods for outlier detection.
PyGOD is released under a BSD 2-Clause license at https://pygod.org and at the Python Package Index (PyPI).
arXiv Detail & Related papers (2022-04-26T06:15:21Z) - PyHHMM: A Python Library for Heterogeneous Hidden Markov Models [63.01207205641885]
PyHHMM is an object-oriented Python implementation of Heterogeneous-Hidden Markov Models (HHMMs).
PyHHMM emphasizes features not supported in similar available frameworks: a heterogeneous observation model, missing data inference, different model order selection criteria, and semi-supervised training.
PyHHMM relies on the numpy, scipy, scikit-learn, and seaborn Python packages, and is distributed under the Apache-2.0 License.
arXiv Detail & Related papers (2022-01-12T07:32:36Z) - PyHealth: A Python Library for Health Predictive Models [53.848478115284195]
PyHealth is an open-source Python toolbox for developing various predictive models on healthcare data.
The data preprocessing module enables the transformation of complex healthcare datasets into machine learning friendly formats.
The predictive modeling module provides more than 30 machine learning models, including established ensemble trees and deep neural network-based approaches.
arXiv Detail & Related papers (2021-01-11T22:02:08Z) - PySAD: A Streaming Anomaly Detection Framework in Python [0.0]
Streaming anomaly detection requires algorithms that operate under strict constraints. We present PySAD, a comprehensive Python framework addressing these challenges through a unified architecture.
arXiv Detail & Related papers (2020-09-05T17:41:37Z) - MOGPTK: The Multi-Output Gaussian Process Toolkit [71.08576457371433]
We present MOGPTK, a Python package for multi-channel data modelling using Gaussian processes (GP).
The aim of this toolkit is to make multi-output GP (MOGP) models accessible to researchers, data scientists, and practitioners alike.
arXiv Detail & Related papers (2020-02-09T23:34:49Z) - OPFython: A Python-Inspired Optimum-Path Forest Classifier [68.8204255655161]
This paper proposes a Python-based Optimum-Path Forest framework, denoted as OPFython.
As OPFython is a Python-based library, it provides a more friendly environment and a faster prototyping workspace than the C language.
arXiv Detail & Related papers (2020-01-28T15:46:19Z)