Related papers: Code with Me or for Me? How Increasing AI Automation Transforms Developer Workflows

URL: http://arxiv.org/abs/2507.08149v1
Date: Thu, 10 Jul 2025 20:12:54 GMT
Title: Code with Me or for Me? How Increasing AI Automation Transforms Developer Workflows
Authors: Valerie Chen, Ameet Talwalkar, Robert Brennan, Graham Neubig
Abstract summary: We conduct the first academic study to explore developer interactions with coding agents. We evaluate two leading copilot and agentic coding assistants, GitHub Copilot and OpenHands. Our results show agents have the potential to assist developers in ways that surpass copilots.
Score: 66.1850490474361
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Developers now have access to a growing array of increasingly autonomous AI tools to support software development. While numerous studies have examined developer use of copilots, which can provide chat assistance or code completions, evaluations of coding agents, which can automatically write files and run code, still largely rely on static benchmarks without humans-in-the-loop. In this work, we conduct the first academic study to explore developer interactions with coding agents and characterize how more autonomous AI tools affect user productivity and experience, compared to existing copilots. We evaluate two leading copilot and agentic coding assistants, GitHub Copilot and OpenHands, recruiting participants who regularly use the former. Our results show agents have the potential to assist developers in ways that surpass copilots (e.g., completing tasks that humans might not have accomplished before) and reduce the user effort required to complete tasks. However, there are challenges involved in enabling their broader adoption, including how to ensure users have an adequate understanding of agent behaviors. Our results not only provide insights into how developer workflows change as a result of coding agents but also highlight how user interactions with agents differ from those with existing copilots, motivating a set of recommendations for researchers building new agents. Given the broad set of developers who still largely rely on copilot-like systems, our work highlights key challenges of adopting more agentic systems into developer workflows.

Related papers

A Human Centric Requirements Engineering Framework for Assessing Github Copilot Output [0.0]
GitHub Copilot introduces new challenges in how these software tools address human needs. I analyzed GitHub Copilot's interaction with users through its chat interface. I established a human-centered requirements framework with clear metrics to evaluate these qualities.
arXiv Detail & Related papers (2025-08-05T21:33:23Z)
The Rise of AI Teammates in Software Engineering (SE) 3.0: How Autonomous Coding Agents Are Reshaping Software Engineering [10.252332355171237]
This paper introduces AIDev, the first large-scale dataset capturing how such agents operate in the wild. Spanning over 456,000 pull requests by five leading agents, AIDev provides an unprecedented empirical foundation for studying autonomous teammates in software development. The dataset includes rich metadata on PRs, authorship, review timelines, code changes, and integration outcomes.
arXiv Detail & Related papers (2025-07-20T15:15:58Z)
From Developer Pairs to AI Copilots: A Comparative Study on Knowledge Transfer [8.567835367628787]
With the rise of AI coding assistants, developers now not only work with human partners but also, as some claim, with AI pair programmers. To analyze knowledge transfer in both human-human and human-AI settings, we conducted an empirical study. We found a similar frequency of successful knowledge transfer episodes and overlapping topical categories across both settings.
arXiv Detail & Related papers (2025-06-05T09:13:30Z)
R&D-Agent: Automating Data-Driven AI Solution Building Through LLM-Powered Automated Research, Development, and Evolution [60.80016554091364]
R&D-Agent is a dual-agent framework for iterative exploration. The Researcher agent uses performance feedback to generate ideas, while the Developer agent refines code based on error feedback. R&D-Agent is evaluated on MLE-Bench and emerges as the top-performing machine learning engineering agent.
arXiv Detail & Related papers (2025-05-20T06:07:00Z)
CowPilot: A Framework for Autonomous and Human-Agent Collaborative Web Navigation [70.3224918173672]
CowPilot is a framework supporting autonomous as well as human-agent collaborative web navigation. It reduces the number of steps humans need to perform by allowing agents to propose next steps, while users are able to pause, reject, or take alternative actions. CowPilot can serve as a useful tool for data collection and agent evaluation across websites.
arXiv Detail & Related papers (2025-01-28T00:56:53Z)
Towards Decoding Developer Cognition in the Age of AI Assistants [9.887133861477233]
We propose a controlled observational study combining physiological measurements (EEG and eye tracking) with interaction data to examine developers' use of AI-assisted programming tools. We will recruit professional developers to complete programming tasks both with and without AI assistance while measuring their cognitive load and task completion time.
arXiv Detail & Related papers (2025-01-05T23:25:21Z)
TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks [52.46737975742287]
We introduce TheAgentCompany, a benchmark for evaluating AI agents that interact with the world in similar ways to those of a digital worker. We find that the most competitive agent can complete 30% of tasks autonomously. This paints a nuanced picture of task automation with LM agents in a setting simulating a real workplace.
arXiv Detail & Related papers (2024-12-18T18:55:40Z)
ChatCollab: Exploring Collaboration Between Humans and AI Agents in Software Teams [1.3967206132709542]
ChatCollab's novel architecture allows agents, human or AI, to join collaborations in any role. Using software engineering as a case study, we find that our AI agents successfully identify their roles and responsibilities. In relation to three prior multi-agent AI systems for software development, we find ChatCollab AI agents produce comparable or better software in an interactive game development task.
arXiv Detail & Related papers (2024-12-02T21:56:46Z)
Does Co-Development with AI Assistants Lead to More Maintainable Code? A Registered Report [6.7428644467224]
This study aims to examine the influence of AI assistants on software maintainability. In Phase 1, developers will add a new feature to a Java project, with or without the aid of an AI assistant. Phase 2, a randomized controlled trial, will involve a different set of developers evolving random Phase 1 projects, working without AI assistants.
arXiv Detail & Related papers (2024-08-20T11:48:42Z)
OpenHands: An Open Platform for AI Software Developers as Generalist Agents [109.8507367518992]
We introduce OpenHands, a platform for the development of AI agents that interact with the world in similar ways to a human developer. We describe how the platform allows for the implementation of new agents, safe interaction with sandboxed environments for code execution, and incorporation of evaluation benchmarks.
arXiv Detail & Related papers (2024-07-23T17:50:43Z)
Generative AI for Pull Request Descriptions: Adoption, Impact, and Developer Interventions [11.620351603683496]
GitHub's Copilot for Pull Requests (PRs) is a promising service aiming to automate various developer tasks related to PRs. In this study, we examine 18,256 PRs in which parts of the descriptions were crafted by generative AI. Our findings indicate that Copilot for PRs, though in its infancy, is seeing a marked uptick in adoption.
arXiv Detail & Related papers (2024-02-14T06:20:57Z)
Experiential Co-Learning of Software-Developing Agents [83.34027623428096]
Large language models (LLMs) have brought significant changes to various domains, especially in software development. We introduce Experiential Co-Learning, a novel LLM-agent learning framework. Experiments demonstrate that the framework enables agents to tackle unseen software-developing tasks more effectively.
arXiv Detail & Related papers (2023-12-28T13:50:42Z)

This list is automatically generated from the titles and abstracts of the papers on this site.