Related papers: Using Copilot Agent Mode to Automate Library Migration: A Quantitative Assessment

Using Copilot Agent Mode to Automate Library Migration: A Quantitative Assessment

URL: http://arxiv.org/abs/2510.26699v1
Date: Thu, 30 Oct 2025 17:05:13 GMT
Title: Using Copilot Agent Mode to Automate Library Migration: A Quantitative Assessment
Authors: Aylton Almeida, Laerte Xavier, Marco Tulio Valente,
Abstract summary: Keeping software systems up to date is essential to avoid technical debt, security vulnerabilities, and the rigidity typical of legacy systems.<n>Recent advances in Large Language Models (LLMs) and agentic coding systems offer new opportunities for automating such maintenance tasks.
Score: 0.5735035463793009
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Keeping software systems up to date is essential to avoid technical debt, security vulnerabilities, and the rigidity typical of legacy systems. However, updating libraries and frameworks remains a time consuming and error-prone process. Recent advances in Large Language Models (LLMs) and agentic coding systems offer new opportunities for automating such maintenance tasks. In this paper, we evaluate the update of a well-known Python library, SQLAlchemy, across a dataset of ten client applications. For this task, we use the Github's Copilot Agent Mode, an autonomous AI systema capable of planning and executing multi-step migration workflows. To assess the effectiveness of the automated migration, we also introduce Migration Coverage, a metric that quantifies the proportion of API usage points correctly migrated. The results of our study show that the LLM agent was capable of migrating functionalities and API usages between SQLAlchemy versions (migration coverage: 100%, median), but failed to maintain the application functionality, leading to a low test-pass rate (39.75%, median).

Related papers

MEnvAgent: Scalable Polyglot Environment Construction for Verifiable Software Engineering [54.236614097082395]
We introduce MEnvAgent, a framework for automated Environment construction.<n>MEnvAgent employs a multi-agent Planning-Execution-Verification architecture to autonomously resolve construction failures.<n>MEnvData-SWE is the largest open-source polyglot dataset of realistic verifiable Docker environments to date.
arXiv Detail & Related papers (2026-01-30T11:36:10Z)
TimeMachine-bench: A Benchmark for Evaluating Model Capabilities in Repository-Level Migration Tasks [12.573674060643787]
TimeMachine-bench is a benchmark designed to evaluate software migration in real-world Python projects.<n>Our benchmark consists of GitHub repositories whose tests begin to fail in response to dependency updates.
arXiv Detail & Related papers (2026-01-30T05:42:45Z)
FreshBrew: A Benchmark for Evaluating AI Agents on Java Code Migration [2.981397088242044]
We introduce FreshBrew, a novel benchmark for evaluating AI agents on project-level Java migrations.<n>We benchmark several state-of-the-art LLMs, and compare their performance against established rule-based tools.<n>Our evaluation of AI agents on this benchmark of 228 repositories shows that the top-performing model, 2.5 Gemini Flash, can successfully migrate 52.3 percent of projects to 17.
arXiv Detail & Related papers (2025-10-06T14:39:58Z)
Automatic Qiskit Code Refactoring Using Large Language Models [39.71511919246829]
We present a novel methodology for Qiskit code using large language models (LLMs)<n>We begin by extracting a taxonomy of migration scenarios from the different sources of official Qiskit documentation.<n>This taxonomy, along with the original Python source code, is provided as input to an LLM, which is then tasked with identifying instances of migration scenarios in the code.
arXiv Detail & Related papers (2025-06-17T14:00:48Z)
Which Agent Causes Task Failures and When? On Automated Failure Attribution of LLM Multi-Agent Systems [50.29939179830491]
Failure attribution in LLM multi-agent systems remains underexplored and labor-intensive.<n>We develop and evaluate three automated failure attribution methods, summarizing their corresponding pros and cons.<n>The best method achieves 53.5% accuracy in identifying failure-responsible agents but only 14.2% in pinpointing failure steps.
arXiv Detail & Related papers (2025-04-30T23:09:44Z)
AutoML-Agent: A Multi-Agent LLM Framework for Full-Pipeline AutoML [56.565200973244146]
Automated machine learning (AutoML) accelerates AI development by automating tasks in the development pipeline.<n>Recent works have started exploiting large language models (LLM) to lessen such burden.<n>This paper proposes AutoML-Agent, a novel multi-agent framework tailored for full-pipeline AutoML.
arXiv Detail & Related papers (2024-10-03T20:01:09Z)
Automated Machine Learning in Insurance [0.0]
This paper introduces an AutoML workflow that allows users without domain knowledge or prior experience to achieve robust and effortless ML deployment by writing only a few lines of code. This proposed AutoML is specifically tailored for the insurance application, with features like the balancing step in data preprocessing, ensemble pipelines, and customized loss functions.
arXiv Detail & Related papers (2024-08-26T14:55:40Z)
Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows? [73.81908518992161]
We introduce Spider2-V, the first multimodal agent benchmark focusing on professional data science and engineering. Spider2-V features real-world tasks in authentic computer environments and incorporating 20 enterprise-level professional applications. These tasks evaluate the ability of a multimodal agent to perform data-related tasks by writing code and managing the GUI in enterprise data software systems.
arXiv Detail & Related papers (2024-07-15T17:54:37Z)
Example-Based Automatic Migration of Continuous Integration Systems [2.2836654317217326]
Continuous Integration (CI) is a widely adopted practice for faster code change integration and testing. Developers often migrate between CI systems in pursuit of features like matrix building or better logging. This migration is effort intensive and error-prone owing to limited knowledge of the new CI system and its syntax. We propose a novel approach for CI system's automatic migration: CIMig.
arXiv Detail & Related papers (2024-07-02T20:19:21Z)
DS-Agent: Automated Data Science by Empowering Large Language Models with Case-Based Reasoning [56.887047551101574]
We present DS-Agent, a novel framework that harnesses large language models (LLMs) agent and case-based reasoning (CBR) In the development stage, DS-Agent follows the CBR framework to structure an automatic iteration pipeline, which can flexibly capitalize on the expert knowledge from Kaggle. In the deployment stage, DS-Agent implements a low-resource deployment stage with a simplified CBR paradigm, significantly reducing the demand on foundational capabilities of LLMs.
arXiv Detail & Related papers (2024-02-27T12:26:07Z)
OmniForce: On Human-Centered, Large Model Empowered and Cloud-Edge Collaborative AutoML System [85.8338446357469]
We introduce OmniForce, a human-centered AutoML system that yields both human-assisted ML and ML-assisted human techniques. We show how OmniForce can put an AutoML system into practice and build adaptive AI in open-environment scenarios.
arXiv Detail & Related papers (2023-03-01T13:35:22Z)

This list is automatically generated from the titles and abstracts of the papers in this site.