Developer Productivity With and Without GitHub Copilot: A Longitudinal Mixed-Methods Case Study
- URL: http://arxiv.org/abs/2509.20353v1
- Date: Wed, 24 Sep 2025 17:55:56 GMT
- Title: Developer Productivity With and Without GitHub Copilot: A Longitudinal Mixed-Methods Case Study
- Authors: Viktoria Stray, Elias Goldmann Brandtzæg, Viggo Tellefsen Wivestad, Astri Barbala, Nils Brede Moe
- Abstract summary: This study investigates the real-world impact of the generative AI (GenAI) tool GitHub Copilot on developer activity and perceived productivity. We analyzed 26,317 unique non-merge commits from 703 of NAV IT's GitHub repositories over a two-year period. Our analysis of activity metrics revealed that individuals who used Copilot were consistently more active than non-users, even prior to Copilot's introduction.
- Score: 1.745178216563833
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This study investigates the real-world impact of the generative AI (GenAI) tool GitHub Copilot on developer activity and perceived productivity. We conducted a mixed-methods case study in NAV IT, a large public sector agile organization. We analyzed 26,317 unique non-merge commits from 703 of NAV IT's GitHub repositories over a two-year period, focusing on commit-based activity metrics from 25 Copilot users and 14 non-users. The analysis was complemented by survey responses on their roles and perceived productivity, as well as 13 interviews. Our analysis of activity metrics revealed that individuals who used Copilot were consistently more active than non-users, even prior to Copilot's introduction. We did not find any statistically significant changes in commit-based activity for Copilot users after they adopted the tool, although minor increases were observed. This suggests a discrepancy between changes in commit-based metrics and the subjective experience of productivity.
Related papers
- Impacts of Generative AI on Agile Teams' Productivity: A Multi-Case Longitudinal Study [5.9568322124195845]
Generative Artificial Intelligence (GenAI) tools represent a paradigm shift in software engineering. This study aims to provide a longitudinal evaluation of GenAI's impact on agile software teams.
arXiv Detail & Related papers (2026-02-14T13:26:16Z)
- ET-Agent: Incentivizing Effective Tool-Integrated Reasoning Agent via Behavior Calibration [68.89572566071575]
ET-Agent is a training framework for calibrating an agent's tool-use behavior. It is designed to progressively calibrate erroneous behavioral patterns to optimal behaviors.
arXiv Detail & Related papers (2026-01-11T11:05:26Z)
- Comprehension-Performance Gap in GenAI-Assisted Brownfield Programming: A Replication and Extension [0.41998444721319217]
Code comprehension is essential for brownfield programming tasks. Generative AI (GenAI) coding assistants such as GitHub Copilot have been shown to improve developer productivity. We explore both performance and comprehension in GenAI-assisted brownfield programming tasks.
arXiv Detail & Related papers (2025-11-04T19:03:55Z)
- BugPilot: Complex Bug Generation for Efficient Learning of SWE Skills [59.003563837981886]
High quality bugs are key to training the next generation of language model based software engineering (SWE) agents. We introduce a novel method for synthetic generation of difficult and diverse bugs.
arXiv Detail & Related papers (2025-10-22T17:58:56Z)
- Holistic Agent Leaderboard: The Missing Infrastructure for AI Agent Evaluation [87.47155146067962]
We provide a standardized evaluation harness that orchestrates parallel evaluations across hundreds of tasks. We conduct three-dimensional analysis spanning models, scaffolds, and benchmarks. Our analysis reveals surprising insights, such as higher reasoning effort reducing accuracy in the majority of runs.
arXiv Detail & Related papers (2025-10-13T22:22:28Z)
- Code with Me or for Me? How Increasing AI Automation Transforms Developer Workflows [60.04362496037186]
We present the first controlled study of developer interactions with coding agents. We evaluate two leading copilot and agentic coding assistants. Our results show agents can assist developers in ways that surpass copilots.
arXiv Detail & Related papers (2025-07-10T20:12:54Z)
- A Qualitative Study of User Perception of M365 AI Copilot [11.684396657620981]
We present results from a six month trial of M365 Copilot conducted at our organisation in 2024. The study explored user perceptions of M365 Copilot's effectiveness, productivity impact, evolving expectations, ethical concerns, and overall satisfaction. While M365 Copilot demonstrated value in specific operational areas, its broader impact remained constrained by usability limitations and the need for human oversight.
arXiv Detail & Related papers (2025-03-22T06:11:10Z)
- WHALES: A Multi-Agent Scheduling Dataset for Enhanced Cooperation in Autonomous Driving [14.977628973132099]
We introduce WHALES, the first large-scale V2X dataset explicitly designed to benchmark communication-aware agent scheduling and scalable cooperative perception. WHALES features an average of 8.4 cooperative agents per scene and 2.01 million annotated 3D objects across diverse traffic scenarios. It incorporates detailed communication metadata to emulate real-world communication bottlenecks, enabling rigorous evaluation of scheduling strategies.
arXiv Detail & Related papers (2024-11-20T14:12:34Z)
- Impact of the Availability of ChatGPT on Software Development: A Synthetic Difference in Differences Estimation using GitHub Data [49.1574468325115]
ChatGPT is an AI tool that enhances software production efficiency.
We estimate ChatGPT's effects on the number of git pushes, repositories, and unique developers per 100,000 people.
These results suggest that AI tools like ChatGPT can substantially boost developer productivity, though further analysis is needed to address potential downsides such as low quality code and privacy concerns.
arXiv Detail & Related papers (2024-06-16T19:11:15Z)
- Impact of AI-tooling on the Engineering Workspace [0.0]
Significant changes were observed in coding time fractions among Copilot users.
Some companies experienced a decrease in PR pickup times by up to 33%.
One company experienced a shift of up to 17% of effort from maintenance and support work towards product growth initiatives.
arXiv Detail & Related papers (2024-06-11T20:04:09Z)
- The Impact of AI Tool on Engineering at ANZ Bank An Empirical Study on GitHub Copilot within Corporate Environment [0.0]
This study explores the integration of AI tools in software engineering practices within a large organization.
We focus on ANZ Bank, which employs over 5000 engineers covering all aspects of the software development life cycle.
This paper details an experiment conducted using GitHub Copilot, a notable AI tool, within a controlled environment to evaluate its effectiveness in real-world engineering tasks.
arXiv Detail & Related papers (2024-02-08T12:47:57Z)
- Measuring the Runtime Performance of C++ Code Written by Humans using GitHub Copilot [1.4665528337423246]
We evaluate the runtime performance of C++ code produced when developers use GitHub Copilot versus when they do not. Our results suggest that using Copilot may produce C++ code with (statistically significant) slower runtime performance.
arXiv Detail & Related papers (2023-05-10T20:14:52Z)
- CLUTR: Curriculum Learning via Unsupervised Task Representation Learning [130.79246770546413]
CLUTR is a novel curriculum learning algorithm that decouples task representation and curriculum learning into a two-stage optimization.
We show CLUTR outperforms PAIRED, a principled and popular UED method, in terms of generalization and sample efficiency in the challenging CarRacing and navigation environments.
arXiv Detail & Related papers (2022-10-19T01:45:29Z)
- Data-driven Koopman Operators for Model-based Shared Control of Human-Machine Systems [66.65503164312705]
We present a data-driven shared control algorithm that can be used to improve a human operator's control of complex machines.
Both the dynamics and information about the user's interaction are learned from observation through the use of a Koopman operator.
We find that model-based shared control significantly improves task and control metrics when compared to a natural learning, or user only, control paradigm.
arXiv Detail & Related papers (2020-06-12T14:14:07Z)