Assistance or Disruption? Exploring and Evaluating the Design and Trade-offs of Proactive AI Programming Support
- URL: http://arxiv.org/abs/2502.18658v4
- Date: Mon, 08 Sep 2025 16:34:58 GMT
- Title: Assistance or Disruption? Exploring and Evaluating the Design and Trade-offs of Proactive AI Programming Support
- Authors: Kevin Pu, Daniel Lazaro, Ian Arawjo, Haijun Xia, Ziang Xiao, Tovi Grossman, Yan Chen
- Abstract summary: We introduce and evaluate Codellaborator, a design probe agent that initiates programming assistance based on editor activities and task context. We find that proactive agents increase efficiency compared to the prompt-only paradigm, but also incur workflow disruptions.
- Score: 36.082282294551405
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: AI programming tools enable powerful code generation, and recent prototypes attempt to reduce user effort with proactive AI agents, but their impact on programming workflows remains unexplored. We introduce and evaluate Codellaborator, a design probe LLM agent that initiates programming assistance based on editor activities and task context. We explored three interface variants to assess trade-offs between increasingly salient AI support: prompt-only, proactive agent, and proactive agent with presence and context (Codellaborator). In a within-subject study (N=18), we find that proactive agents increase efficiency compared to the prompt-only paradigm, but also incur workflow disruptions. However, presence indicators and interaction context support alleviated disruptions and improved users' awareness of AI processes. We underscore trade-offs of Codellaborator on user control, ownership, and code understanding, emphasizing the need to adapt proactivity to programming processes. Our research contributes to the design exploration and evaluation of proactive AI systems, presenting design implications for AI-integrated programming workflows.
Related papers
- Pushing Forward Pareto Frontiers of Proactive Agents with Behavioral Agentic Optimization [61.641777037967366]
Proactive large language model (LLM) agents aim to actively plan, query, and interact over multiple turns. Agentic reinforcement learning (RL) has emerged as a promising solution for training such agents in multi-turn settings. We propose BAO, an agentic RL framework that combines behavior enhancement to enrich proactive reasoning and information-gathering capabilities.
arXiv Detail & Related papers (2026-02-11T20:40:43Z)
- AgentIF-OneDay: A Task-level Instruction-Following Benchmark for General AI Agents in Daily Scenarios [49.90735676070039]
The capacity of AI agents to effectively handle tasks of increasing duration and complexity continues to grow. We argue that current evaluations prioritize increasing task difficulty without sufficiently addressing the diversity of agentic tasks. We propose AgentIF-OneDay, aimed at determining whether general users can utilize natural language instructions and AI agents to complete a diverse array of daily tasks.
arXiv Detail & Related papers (2026-01-28T13:49:18Z)
- Developer Interaction Patterns with Proactive AI: A Five-Day Field Study [7.26202905367366]
We present a field study of proactive AI assistance among professional developers. We examined AI interventions across 5,732 interaction points to understand how proactive suggestions are received. Our findings reveal systematic patterns in human receptivity to proactive suggestions.
arXiv Detail & Related papers (2026-01-15T10:20:57Z)
- Exploring Human-AI Collaboration Using Mental Models of Early Adopters of Multi-Agent Generative AI Tools [4.382163871275696]
We investigated how early adopters and developers conceptualize multi-agent Gen AI tools. We conducted semi-structured interviews with 13 developers, all early adopters of multi-agent Gen AI technology who work at Microsoft. We identified key challenges, including error propagation, unpredictable and unproductive agent loop behavior, and the need for clear communication to mitigate the layered transparency issues.
arXiv Detail & Related papers (2025-09-10T05:35:38Z)
- Code with Me or for Me? How Increasing AI Automation Transforms Developer Workflows [66.1850490474361]
We conduct the first academic study to explore developer interactions with coding agents. We evaluate two leading copilot and agentic coding assistants, GitHub Copilot and OpenHands. Our results show agents have the potential to assist developers in ways that surpass copilots.
arXiv Detail & Related papers (2025-07-10T20:12:54Z)
- Towards a Playground to Democratize Experimentation and Benchmarking of AI Agents for Network Troubleshooting [48.131257144711576]
We focus on the application of AI agents to network troubleshooting. We elaborate on the need for a standardized, reproducible, and open benchmarking platform.
arXiv Detail & Related papers (2025-07-01T08:46:37Z)
- Understanding Software Engineering Agents: A Study of Thought-Action-Result Trajectories [18.129031749321058]
Large Language Model (LLM)-based agents are increasingly employed to automate complex software engineering tasks. Despite their widespread adoption, the internal decision-making processes of these agents remain largely unexplored. We present a large-scale empirical study of the thought-action-result trajectories of three state-of-the-art LLM-based agents.
arXiv Detail & Related papers (2025-06-23T16:34:52Z)
- Exploring Prompt Patterns in AI-Assisted Code Generation: Towards Faster and More Effective Developer-AI Collaboration [3.1861081539404137]
This paper explores the application of structured prompt patterns to minimize the number of interactions required for satisfactory AI-assisted code generation. We analyzed seven distinct prompt patterns to evaluate their effectiveness in reducing back-and-forth communication between developers and AI.
arXiv Detail & Related papers (2025-06-02T12:43:08Z)
- Interactive Agents to Overcome Ambiguity in Software Engineering [61.40183840499932]
AI agents are increasingly being deployed to automate tasks, often based on ambiguous and underspecified user instructions. Making unwarranted assumptions and failing to ask clarifying questions can lead to suboptimal outcomes. We study the ability of LLM agents to handle ambiguous instructions in interactive code generation settings by evaluating proprietary and open-weight models on their performance.
arXiv Detail & Related papers (2025-02-18T17:12:26Z)
- Fine-Grained Appropriate Reliance: Human-AI Collaboration with a Multi-Step Transparent Decision Workflow for Complex Task Decomposition [14.413413322901409]
We propose to investigate the impact of a novel Multi-Step Transparent (MST) decision workflow on user reliance behaviors. Our findings demonstrate that human-AI collaboration with an MST decision workflow can outperform one-step collaboration in specific contexts. Our work highlights that there is no one-size-fits-all decision workflow that can help obtain optimal human-AI collaboration.
arXiv Detail & Related papers (2025-01-19T01:03:09Z)
- How Developers Interact with AI: A Taxonomy of Human-AI Collaboration in Software Engineering [8.65285948382426]
We propose a taxonomy of interaction types between developers and AI tools, identifying eleven distinct interaction types. Building on this taxonomy, we outline a research agenda focused on optimizing AI interactions, improving developer control, and addressing trust and usability challenges in AI-assisted development.
arXiv Detail & Related papers (2025-01-15T12:53:49Z)
- AIOpsLab: A Holistic Framework to Evaluate AI Agents for Enabling Autonomous Clouds [12.464941027105306]
AI for IT Operations (AIOps) aims to automate complex operational tasks, such as fault localization and root cause analysis, to reduce human workload and minimize customer impact. Recent advances in Large Language Models (LLMs) and AI agents are revolutionizing AIOps by enabling end-to-end and multitask automation. We present AIOPSLAB, a framework that not only deploys microservice cloud environments, injects faults, generates workloads, and exports telemetry data, but also orchestrates these components and provides interfaces for interacting with and evaluating agents.
arXiv Detail & Related papers (2025-01-12T04:17:39Z)
- Proactive Agent: Shifting LLM Agents from Reactive Responses to Active Assistance [95.03771007780976]
We tackle the challenge of developing proactive agents capable of anticipating and initiating tasks without explicit human instructions. First, we collect real-world human activities to generate proactive task predictions. These predictions are labeled by human annotators as either accepted or rejected. The labeled data is used to train a reward model that simulates human judgment.
arXiv Detail & Related papers (2024-10-16T08:24:09Z)
- Constraining Participation: Affordances of Feedback Features in Interfaces to Large Language Models [49.74265453289855]
Large language models (LLMs) are now accessible to anyone with a computer, a web browser, and an internet connection via browser-based interfaces.
This paper examines the affordances of interactive feedback features in ChatGPT's interface, analysing how they shape user input and participation in iteration.
arXiv Detail & Related papers (2024-08-27T13:50:37Z)
- Compromising Embodied Agents with Contextual Backdoor Attacks [69.71630408822767]
Large language models (LLMs) have transformed the development of embodied intelligence.
This paper uncovers a significant backdoor security threat within this process.
By poisoning just a few contextual demonstrations, attackers can covertly compromise the contextual environment of a black-box LLM.
arXiv Detail & Related papers (2024-08-06T01:20:12Z)
- WorkArena: How Capable Are Web Agents at Solving Common Knowledge Work Tasks? [83.19032025950986]
We study the use of large language model-based agents for interacting with software via web browsers.
WorkArena is a benchmark of 33 tasks based on the widely-used ServiceNow platform.
BrowserGym is an environment for the design and evaluation of such agents.
arXiv Detail & Related papers (2024-03-12T14:58:45Z)
- The Foundations of Computational Management: A Systematic Approach to Task Automation for the Integration of Artificial Intelligence into Existing Workflows [55.2480439325792]
This article introduces Computational Management, a systematic approach to task automation.
The article offers three easy step-by-step procedures to begin the process of implementing AI within a workflow.
arXiv Detail & Related papers (2024-02-07T01:45:14Z)
- Comparing Software Developers with ChatGPT: An Empirical Investigation [0.0]
This paper conducts an empirical investigation, contrasting the performance of software engineers and AI systems, like ChatGPT, across different evaluation metrics.
The paper posits that a comprehensive comparison of software engineers and AI-based solutions, considering various evaluation criteria, is pivotal in fostering human-machine collaboration.
arXiv Detail & Related papers (2023-05-19T17:25:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.