Echoes of AI: Investigating the Downstream Effects of AI Assistants on Software Maintainability
- URL: http://arxiv.org/abs/2507.00788v1
- Date: Tue, 01 Jul 2025 14:24:37 GMT
- Title: Echoes of AI: Investigating the Downstream Effects of AI Assistants on Software Maintainability
- Authors: Markus Borg, Dave Hewett, Nadim Hagatulah, Noric Couderc, Emma Söderberg, Donald Graham, Uttam Kini, Dave Farley,
- Abstract summary: This study investigates whether co-development with AI assistants affects software maintainability.<n> AI-assisted development in Phase 1 led to a modest speedup in subsequent evolution.<n>For habitual AI users, the mean speedup was 55.9%.
- Score: 5.677464428950146
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: [Context] AI assistants, like GitHub Copilot and Cursor, are transforming software engineering. While several studies highlight productivity improvements, their impact on maintainability requires further investigation. [Objective] This study investigates whether co-development with AI assistants affects software maintainability, specifically how easily other developers can evolve the resulting source code. [Method] We conducted a two-phase controlled experiment involving 151 participants, 95% of whom were professional developers. In Phase 1, participants added a new feature to a Java web application, with or without AI assistance. In Phase 2, a randomized controlled trial, new participants evolved these solutions without AI assistance. [Results] AI-assisted development in Phase 1 led to a modest speedup in subsequent evolution and slightly higher average CodeHealth. Although neither difference was significant overall, the increase in CodeHealth was statistically significant when habitual AI users completed Phase 1. For Phase 1, we also observed a significant effect that corroborates previous productivity findings: using an AI assistant yielded a 30.7% median decrease in task completion time. Moreover, for habitual AI users, the mean speedup was 55.9%. [Conclusions] Our study adds to the growing evidence that AI assistants can effectively accelerate development. Moreover, we did not observe warning signs of degraded code-level maintainability. We recommend that future research focus on risks such as code bloat from excessive code generation and the build-up of cognitive debt as developers invest less mental effort during implementation.
Related papers
- SEAgent: Self-Evolving Computer Use Agent with Autonomous Learning from Experience [71.82719117238307]
We propose SEAgent, an agentic self-evolving framework enabling computer-use agents to evolve through interactions with unfamiliar software.<n>We validate the effectiveness of SEAgent across five novel software environments within OS-World.<n>Our approach achieves a significant improvement of 23.2% in success rate, from 11.3% to 34.5%, over a competitive open-source CUA.
arXiv Detail & Related papers (2025-08-06T17:58:46Z) - Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity [0.5789840336223054]
Despite widespread adoption, the impact of AI tools on software development in the wild remains understudied.<n>We conduct a randomized controlled trial (RCT) to understand how AI tools at the February-June 2025 frontier affect the productivity of experienced open-source developers.<n>16 developers with moderate AI experience complete 246 tasks in mature projects on which they have an average of 5 years of prior experience.
arXiv Detail & Related papers (2025-07-12T00:16:33Z) - Code with Me or for Me? How Increasing AI Automation Transforms Developer Workflows [66.1850490474361]
We conduct the first academic study to explore developer interactions with coding agents.<n>We evaluate two leading copilot and agentic coding assistants, GitHub Copilot and OpenHands.<n>Our results show agents have the potential to assist developers in ways that surpass copilots.
arXiv Detail & Related papers (2025-07-10T20:12:54Z) - From Code Generation to Software Testing: AI Copilot with Context-Based RAG [8.28588489551341]
We propose a novel perspective on software testing by positing bug detection and coding with fewer bugs as two interconnected problems.<n>We introduce Copilot for Testing, an automated testing system that synchronizes bug detection with updates.<n>Our evaluation demonstrates a 31.2% improvement in bug detection accuracy, a 12.6% increase in critical test coverage, and a 10.5% higher user acceptance rate.
arXiv Detail & Related papers (2025-04-02T16:20:05Z) - Towards Decoding Developer Cognition in the Age of AI Assistants [9.887133861477233]
We propose a controlled observational study combining physiological measurements (EEG and eye tracking) with interaction data to examine developers' use of AI-assisted programming tools.<n>We will recruit professional developers to complete programming tasks both with and without AI assistance while measuring their cognitive load and task completion time.
arXiv Detail & Related papers (2025-01-05T23:25:21Z) - How much does AI impact development speed? An enterprise-based randomized controlled trial [8.759453531975668]
We estimate the impact of three AI features on the time developers spent on a complex, enterprise-grade task.
We also found an interesting effect whereby developers who spend more hours on code-related activities per day were faster with AI.
arXiv Detail & Related papers (2024-10-16T18:31:14Z) - Does Co-Development with AI Assistants Lead to More Maintainable Code? A Registered Report [6.7428644467224]
This study aims to examine the influence of AI assistants on software maintainability.
In Phase 1, developers will add a new feature to a Java project, with or without the aid of an AI assistant.
In Phase 2, a randomized controlled trial, will involve a different set of developers evolving random Phase 1 projects - working without AI assistants.
arXiv Detail & Related papers (2024-08-20T11:48:42Z) - SERL: A Software Suite for Sample-Efficient Robotic Reinforcement Learning [82.46975428739329]
We develop a library containing a sample efficient off-policy deep RL method, together with methods for computing rewards and resetting the environment.<n>We find that our implementation can achieve very efficient learning, acquiring policies for PCB board assembly, cable routing, and object relocation.<n>These policies achieve perfect or near-perfect success rates, extreme robustness even under perturbations, and exhibit emergent robustness recovery and correction behaviors.
arXiv Detail & Related papers (2024-01-29T10:01:10Z) - Exploration with Principles for Diverse AI Supervision [88.61687950039662]
Training large transformers using next-token prediction has given rise to groundbreaking advancements in AI.
While this generative AI approach has produced impressive results, it heavily leans on human supervision.
This strong reliance on human oversight poses a significant hurdle to the advancement of AI innovation.
We propose a novel paradigm termed Exploratory AI (EAI) aimed at autonomously generating high-quality training data.
arXiv Detail & Related papers (2023-10-13T07:03:39Z) - Generation Probabilities Are Not Enough: Uncertainty Highlighting in AI Code Completions [54.55334589363247]
We study whether conveying information about uncertainty enables programmers to more quickly and accurately produce code.
We find that highlighting tokens with the highest predicted likelihood of being edited leads to faster task completion and more targeted edits.
arXiv Detail & Related papers (2023-02-14T18:43:34Z) - Does the Whole Exceed its Parts? The Effect of AI Explanations on
Complementary Team Performance [44.730580857733]
Prior studies observed improvements from explanations only when the AI, alone, outperformed both the human and the best team.
We conduct mixed-method user studies on three datasets, where an AI with accuracy comparable to humans helps participants solve a task.
We find explanations increase the chance that humans will accept the AI's recommendation, regardless of its correctness.
arXiv Detail & Related papers (2020-06-26T03:34:04Z) - Adversarial vs behavioural-based defensive AI with joint, continual and
active learning: automated evaluation of robustness to deception, poisoning
and concept drift [62.997667081978825]
Recent advancements in Artificial Intelligence (AI) have brought new capabilities to behavioural analysis (UEBA) for cyber-security.
In this paper, we present a solution to effectively mitigate this attack by improving the detection process and efficiently leveraging human expertise.
arXiv Detail & Related papers (2020-01-13T13:54:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.