Related papers: How much does AI impact development speed? An enterprise-based randomized controlled trial

URL: http://arxiv.org/abs/2410.12944v2
Date: Thu, 24 Oct 2024 20:40:08 GMT
Title: How much does AI impact development speed? An enterprise-based randomized controlled trial
Authors: Elise Paradis, Kate Grey, Quinn Madison, Daye Nam, Andrew Macvean, Vahid Meimand, Nan Zhang, Ben Ferrari-Church, Satish Chandra
Abstract summary: We estimate the impact of three AI features on the time developers spent on a complex, enterprise-grade task. We also found an interesting effect whereby developers who spend more hours on code-related activities per day were faster with AI.
Score: 8.759453531975668
License: http://creativecommons.org/licenses/by/4.0/
Abstract: How much does AI assistance impact developer productivity? To date, the software engineering literature has provided a range of answers, targeting a diversity of outcomes: from perceived productivity to speed on task and developer throughput. Our randomized controlled trial with 96 full-time Google software engineers contributes to this literature by sharing an estimate of the impact of three AI features on the time developers spent on a complex, enterprise-grade task. We found that AI significantly shortened the time developers spent on task. Our best estimate of the size of this effect, controlling for factors known to influence developer time on task, stands at about 21%, although our confidence interval is large. We also found an interesting effect whereby developers who spend more hours on code-related activities per day were faster with AI. Product and future research considerations are discussed. In particular, we invite further research that explores the impact of AI at the ecosystem level and across multiple suites of AI-enhanced tools, since we cannot assume that the effect size obtained in our lab study will necessarily apply more broadly, or that the effect of AI found using internal Google tooling in the summer of 2024 will translate across tools and over time.

Related papers

The SPACE of AI: Real-World Lessons on AI's Impact on Developers [0.807084206814932]
We study how developers perceive AI's influence across the dimensions of the SPACE framework: Satisfaction, Performance, Activity, Collaboration and Efficiency. We find that AI is broadly adopted and widely seen as enhancing productivity, particularly for routine tasks. Developers report increased efficiency and satisfaction, with less evidence of impact on collaboration.
arXiv Detail & Related papers (2025-07-31T21:45:54Z)
Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity [0.5789840336223054]
Despite widespread adoption, the impact of AI tools on software development in the wild remains understudied. We conduct a randomized controlled trial (RCT) to understand how AI tools at the February-June 2025 frontier affect the productivity of experienced open-source developers. 16 developers with moderate AI experience complete 246 tasks in mature projects on which they have an average of 5 years of prior experience.
arXiv Detail & Related papers (2025-07-12T00:16:33Z)
Code with Me or for Me? How Increasing AI Automation Transforms Developer Workflows [66.1850490474361]
We conduct the first academic study to explore developer interactions with coding agents. We evaluate two leading copilot and agentic coding assistants, GitHub Copilot and OpenHands. Our results show agents have the potential to assist developers in ways that surpass copilots.
arXiv Detail & Related papers (2025-07-10T20:12:54Z)
SciMaster: Towards General-Purpose Scientific AI Agents, Part I. X-Master as Foundation: Can We Lead on Humanity's Last Exam? [51.112225746095746]
We introduce X-Master, a tool-augmented reasoning agent designed to emulate human researchers. X-Masters sets a new state-of-the-art record on Humanity's Last Exam with a score of 32.1%.
arXiv Detail & Related papers (2025-07-07T17:50:52Z)
Echoes of AI: Investigating the Downstream Effects of AI Assistants on Software Maintainability [5.677464428950146]
This study investigates whether co-development with AI assistants affects software maintainability. AI-assisted development in Phase 1 led to a modest speedup in subsequent evolution. For habitual AI users, the mean speedup was 55.9%.
arXiv Detail & Related papers (2025-07-01T14:24:37Z)
Challenges and Paths Towards AI for Software Engineering [55.95365538122656]
We discuss progress in AI for software engineering in a threefold manner. First, we provide a structured taxonomy of concrete tasks in AI for software engineering. Second, we outline several key bottlenecks that limit current approaches.
arXiv Detail & Related papers (2025-03-28T17:17:57Z)
Towards Decoding Developer Cognition in the Age of AI Assistants [9.887133861477233]
We propose a controlled observational study combining physiological measurements (EEG and eye tracking) with interaction data to examine developers' use of AI-assisted programming tools. We will recruit professional developers to complete programming tasks both with and without AI assistance while measuring their cognitive load and task completion time.
arXiv Detail & Related papers (2025-01-05T23:25:21Z)
TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks [52.46737975742287]
We build a self-contained environment with data that mimics a small software company environment. We find that with the most competitive agent, 24% of the tasks can be completed autonomously. This paints a nuanced picture on task automation with LM agents.
arXiv Detail & Related papers (2024-12-18T18:55:40Z)
Dear Diary: A randomized controlled trial of Generative AI coding tools in the workplace [2.5280615594444567]
Generative AI coding tools are relatively new, and their impact on developers extends beyond traditional coding metrics. This study aims to illuminate developers' preexisting beliefs about generative AI tools, their self perceptions, and how regular use of these tools may alter these beliefs. Our findings reveal that the introduction and sustained use of generative AI coding tools significantly increases developers' perceptions of these tools as both useful and enjoyable.
arXiv Detail & Related papers (2024-10-24T00:07:27Z)
Raising the Stakes: Performance Pressure Improves AI-Assisted Decision Making [57.53469908423318]
We show the effects of performance pressure on AI advice reliance when laypeople complete a common AI-assisted task. We find that when the stakes are high, people use AI advice more appropriately than when stakes are lower, regardless of the presence of an AI explanation.
arXiv Detail & Related papers (2024-10-21T22:39:52Z)
Does Co-Development with AI Assistants Lead to More Maintainable Code? A Registered Report [6.7428644467224]
This study aims to examine the influence of AI assistants on software maintainability. In Phase 1, developers will add a new feature to a Java project, with or without the aid of an AI assistant. Phase 2, a randomized controlled trial, will involve a different set of developers evolving random Phase 1 projects, working without AI assistants.
arXiv Detail & Related papers (2024-08-20T11:48:42Z)
Towards the Terminator Economy: Assessing Job Exposure to AI through LLMs [10.844598404826355]
One-third of U.S. employment is highly exposed to AI, primarily in high-skill jobs. This exposure correlates positively with employment and wage growth from 2019 to 2023.
arXiv Detail & Related papers (2024-07-27T08:14:18Z)
Impact of the Availability of ChatGPT on Software Development: A Synthetic Difference in Differences Estimation using GitHub Data [49.1574468325115]
ChatGPT is an AI tool that enhances software production efficiency. We estimate ChatGPT's effects on the number of git pushes, repositories, and unique developers per 100,000 people. These results suggest that AI tools like ChatGPT can substantially boost developer productivity, though further analysis is needed to address potential downsides such as low quality code and privacy concerns.
arXiv Detail & Related papers (2024-06-16T19:11:15Z)
WorkArena: How Capable Are Web Agents at Solving Common Knowledge Work Tasks? [83.19032025950986]
We study the use of large language model-based agents for interacting with software via web browsers. WorkArena is a benchmark of 33 tasks based on the widely-used ServiceNow platform. BrowserGym is an environment for the design and evaluation of such agents.
arXiv Detail & Related papers (2024-03-12T14:58:45Z)
SERL: A Software Suite for Sample-Efficient Robotic Reinforcement Learning [85.21378553454672]
We develop a library containing a sample efficient off-policy deep RL method, together with methods for computing rewards and resetting the environment. We find that our implementation can achieve very efficient learning, acquiring policies for PCB board assembly, cable routing, and object relocation. These policies achieve perfect or near-perfect success rates, extreme robustness even under perturbations, and exhibit emergent robustness recovery and correction behaviors.
arXiv Detail & Related papers (2024-01-29T10:01:10Z)
The Potential Impact of AI Innovations on U.S. Occupations [3.0829845709781725]
We employ Deep Learning Natural Language Processing to identify AI patents that may impact various occupational tasks at scale. Our methodology relies on a comprehensive dataset of 17,879 task descriptions and quantifies AI's potential impact. Our results reveal that some occupations will potentially be impacted, and that impact is intricately linked to specific skills.
arXiv Detail & Related papers (2023-12-07T21:44:07Z)
Generation Probabilities Are Not Enough: Uncertainty Highlighting in AI Code Completions [54.55334589363247]
We study whether conveying information about uncertainty enables programmers to more quickly and accurately produce code. We find that highlighting tokens with the highest predicted likelihood of being edited leads to faster task completion and more targeted edits.
arXiv Detail & Related papers (2023-02-14T18:43:34Z)
Artificial Intelligence in Software Testing : Impact, Problems, Challenges and Prospect [0.0]
The study aims to recognize and explain some of the biggest challenges software testers face while applying AI to testing. The paper also proposes some key contributions of AI in the future to the domain of software testing.
arXiv Detail & Related papers (2022-01-14T10:21:51Z)
AI Explainability 360: Impact and Design [120.95633114160688]
In 2019, we created AI Explainability 360 (Arya et al. 2020), an open source software toolkit featuring ten diverse and state-of-the-art explainability methods. This paper examines the impact of the toolkit with several case studies, statistics, and community feedback. The paper also describes the flexible design of the toolkit, examples of its use, and the significant educational material and documentation available to its users.
arXiv Detail & Related papers (2021-09-24T19:17:09Z)

This list is automatically generated from the titles and abstracts of the papers on this site.