Context Engineering for AI Agents in Open-Source Software
- URL: http://arxiv.org/abs/2510.21413v2
- Date: Mon, 27 Oct 2025 08:53:34 GMT
- Title: Context Engineering for AI Agents in Open-Source Software
- Authors: Seyedmoein Mohsenimofidi, Matthias Galster, Christoph Treude, Sebastian Baltes
- Abstract summary: GenAI-based coding assistants have disrupted software development. Their next generation is agent-based, operating with more autonomy and potentially without human oversight. One challenge is to provide AI agents with sufficient context about the software projects they operate in.
- Score: 13.236926479239754
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: GenAI-based coding assistants have disrupted software development. Their next generation is agent-based, operating with more autonomy and potentially without human oversight. One challenge is to provide AI agents with sufficient context about the software projects they operate in. Like humans, AI agents require contextual information to develop solutions that are in line with the target architecture, interface specifications, coding guidelines, standard workflows, and other project-specific policies. Popular AI agents for software development (e.g., Claude Code) advocate for maintaining tool-specific version-controlled Markdown files that cover aspects such as the project structure, building and testing, or code style. The content of these files is automatically added to each prompt. AGENTS.md has emerged as a potential standard that consolidates tool-specific formats. However, little is known about whether and how developers adopt this format. Therefore, in this paper, we present the results of a preliminary study investigating the adoption of AI configuration files in 466 open-source software projects, what information developers provide in these files, how they present that information, and how the files evolve over time. Our findings indicate that there is no established structure yet, and that there is a lot of variation in terms of how context is provided (descriptive, prescriptive, prohibitive, explanatory, conditional). We see great potential in studying which modifications in structure or presentation can positively affect the quality of the generated content. Finally, our analysis of commits modifying AGENTS.md files provides first insights into how projects continuously extend and maintain these files. We conclude the paper by outlining how the adoption of AI configuration files provides a unique opportunity to study real-world prompt and context engineering.
Related papers
- Supporting software engineering tasks with agentic AI: Demonstration on document retrieval and test scenario generation [0.0]
We introduce agentic AI solutions for two software engineering tasks. First, we developed a solution for automatic test scenario generation from a detailed requirements description. Second, we developed an agentic AI solution for the document retrieval task in the context of software engineering documents.
arXiv Detail & Related papers (2026-02-04T16:33:16Z)
- Spec-Driven Development: From Code to Contract in the Age of AI Coding Assistants [0.0]
Spec-driven development (SDD) treats specifications as the source of truth and code as a generated or verified secondary artifact. We present three levels of specification rigor (spec-first, spec-anchored, and spec-as-source) with clear guidance on when each applies.
arXiv Detail & Related papers (2026-01-30T04:45:42Z)
- Advances and Frontiers of LLM-based Issue Resolution in Software Engineering: A Comprehensive Survey [59.3507264893654]
Issue resolution is a complex Software Engineering task integral to real-world development. Benchmarks like SWE-bench revealed this task as profoundly difficult for large language models. This paper presents a systematic survey of this emerging domain.
arXiv Detail & Related papers (2026-01-15T18:55:03Z)
- An Empirical Study of Developer-Provided Context for AI Coding Assistants in Open-Source Projects [2.392035679895744]
This paper presents a large-scale empirical study to characterize the emerging form of developer-provided context. We developed a comprehensive taxonomy of project context that developers consider essential, organized into five high-level themes. Our study also explores how this context varies across different project types and programming languages.
arXiv Detail & Related papers (2025-12-21T23:51:02Z)
- Everything is Context: Agentic File System Abstraction for Context Engineering [11.63011212134865]
This paper proposes a file-system abstraction for context engineering. The abstraction offers a persistent, governed infrastructure for managing heterogeneous context artefacts. As GenAI becomes an active collaborator in decision support, humans play a central role as curators, verifiers, and co-reasoners.
arXiv Detail & Related papers (2025-12-05T06:56:45Z)
- Decoding the Configuration of AI Coding Agents: Insights from Claude Code Projects [0.1631115063641726]
Agentic code assistants are a new generation of AI systems capable of performing end-to-end software engineering tasks. Their behavior and effectiveness depend heavily on configuration files that define architectural constraints, coding practices, and tool usage policies. This paper presents an empirical study of the configuration ecosystem of Claude Code, one of the most widely used agentic coding systems.
arXiv Detail & Related papers (2025-11-12T12:28:57Z)
- Vibe Coding: Toward an AI-Native Paradigm for Semantic and Intent-Driven Programming [0.0]
This paper introduces vibe coding, an emerging AI-native programming paradigm in which a developer specifies high-level functional intent along with qualitative descriptors of the desired "vibe". An intelligent agent then transforms those specifications into executable software.
arXiv Detail & Related papers (2025-10-09T22:31:53Z)
- DRBench: A Realistic Benchmark for Enterprise Deep Research [81.49694432639406]
DRBench is a benchmark for evaluating AI agents on complex, open-ended deep research tasks in enterprise settings. We release 15 deep research tasks across 10 domains, such as Sales, Cybersecurity, and Compliance.
arXiv Detail & Related papers (2025-09-30T18:47:20Z)
- OS Agents: A Survey on MLLM-based Agents for General Computing Devices Use [101.57043903478257]
The dream to create AI assistants as capable and versatile as the fictional J.A.R.V.I.S from Iron Man has long captivated imaginations. With the evolution of (multi-modal) large language models ((M)LLMs), this dream is closer to reality. This survey aims to consolidate the state of OS Agents research, providing insights to guide both academic inquiry and industrial development.
arXiv Detail & Related papers (2025-08-06T14:33:45Z)
- Unified Software Engineering agent as AI Software Engineer [14.733475669942276]
Large Language Model (LLM) technology has raised expectations for automated coding. In this paper, we seek to understand this question by developing a Unified Software Engineering agent or USEagent. We envision USEagent as the first draft of a future AI Software Engineer which can be a team member in future software development teams involving both AI and humans.
arXiv Detail & Related papers (2025-06-17T16:19:13Z)
- Generative AI for Software Architecture. Applications, Challenges, and Future Directions [6.883775050854466]
We aim to systematically synthesize the use, rationale, contexts, usability, and future challenges of GenAI in software architecture. Our review identified significant adoption of GenAI for architectural decision support and architectural reconstruction.
arXiv Detail & Related papers (2025-03-17T15:49:30Z)
- Project Archetypes: A Blessing and a Curse for AI Development [3.2157163136267943]
The development of applications using machine learning and artificial intelligence provides a context in which existing archetypes might outdate and need to be questioned, adapted, or replaced.
We analyzed 36 interviews from 21 projects between IBM Watson and client companies and identified four project archetypes members initially used to understand the projects.
We then derive a new project archetype, cognitive computing project, from the interviews. It can inform future development projects based on AI-development platforms.
arXiv Detail & Related papers (2024-08-08T08:52:19Z)
- Alibaba LingmaAgent: Improving Automated Issue Resolution via Comprehensive Repository Exploration [64.19431011897515]
This paper presents Alibaba LingmaAgent, a novel Automated Software Engineering method designed to comprehensively understand and utilize whole software repositories for issue resolution. Our approach introduces a top-down method to condense critical repository information into a knowledge graph, reducing complexity, and employs a Monte Carlo tree search based strategy. In production deployment and evaluation at Alibaba Cloud, LingmaAgent automatically resolved 16.9% of in-house issues faced by development engineers, and solved 43.3% of problems after manual intervention.
arXiv Detail & Related papers (2024-06-03T15:20:06Z)
- Pangu-Agent: A Fine-Tunable Generalist Agent with Structured Reasoning [50.47568731994238]
A key method for creating Artificial Intelligence (AI) agents is Reinforcement Learning (RL).
This paper presents a general framework model for integrating and learning structured reasoning into AI agents' policies.
arXiv Detail & Related papers (2023-12-22T17:57:57Z)
- Generation Probabilities Are Not Enough: Uncertainty Highlighting in AI Code Completions [54.55334589363247]
We study whether conveying information about uncertainty enables programmers to more quickly and accurately produce code.
We find that highlighting tokens with the highest predicted likelihood of being edited leads to faster task completion and more targeted edits.
arXiv Detail & Related papers (2023-02-14T18:43:34Z)
- KILT: a Benchmark for Knowledge Intensive Language Tasks [102.33046195554886]
We present a benchmark for knowledge-intensive language tasks (KILT).
All tasks in KILT are grounded in the same snapshot of Wikipedia.
We find that a shared dense vector index coupled with a seq2seq model is a strong baseline.
arXiv Detail & Related papers (2020-09-04T15:32:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.