Agent READMEs: An Empirical Study of Context Files for Agentic Coding
- URL: http://arxiv.org/abs/2511.12884v1
- Date: Mon, 17 Nov 2025 02:18:55 GMT
- Title: Agent READMEs: An Empirical Study of Context Files for Agentic Coding
- Authors: Worawalan Chatlatanagulchai, Hao Li, Yutaro Kashiwa, Brittany Reid, Kundjanasith Thonglek, Pattara Leelaprute, Arnon Rungsawang, Bundit Manaskasemsak, Bram Adams, Ahmed E. Hassan, Hajimu Iida
- Abstract summary: We study 2,303 agent context files from 1,925 repositories to characterize their structure, maintenance, and content. We find that these files are not static documentation but complex, difficult-to-read artifacts that evolve like configuration code, maintained through frequent, small additions. These findings indicate that while developers use context files to make agents functional, they provide few guardrails to ensure that agent-written code is secure or performant, highlighting the need for improved tooling and practices.
- Score: 8.019313057979522
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Agentic coding tools receive goals written in natural language as input, break them down into specific tasks, and write or execute the actual code with minimal human intervention. Central to this process are agent context files ("READMEs for agents") that provide persistent, project-level instructions. In this paper, we conduct the first large-scale empirical study of 2,303 agent context files from 1,925 repositories to characterize their structure, maintenance, and content. We find that these files are not static documentation but complex, difficult-to-read artifacts that evolve like configuration code, maintained through frequent, small additions. Our content analysis of 16 instruction types shows that developers prioritize functional context, such as build and run commands (62.3%), implementation details (69.9%), and architecture (67.7%). We also identify a significant gap: non-functional requirements like security (14.5%) and performance (14.5%) are rarely specified. These findings indicate that while developers use context files to make agents functional, they provide few guardrails to ensure that agent-written code is secure or performant, highlighting the need for improved tooling and practices.
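The kind of file the study examines can be illustrated with a hypothetical sketch of an agent context file that includes both the common functional context and the rarely specified non-functional guardrails; all commands, paths, and rules below are invented for illustration, not drawn from the studied repositories:

```markdown
# AGENTS.md (illustrative sketch)

## Build and run (functional context, the common case per the study)
- Install: `npm install`
- Build: `npm run build`
- Test: `npm test` (run before every commit)

## Architecture
- `src/api/` exposes the HTTP layer; `src/core/` holds business logic.
- Do not import `src/api/` modules from `src/core/`.

## Security (rarely specified per the study: 14.5%)
- Never log request bodies; they may contain credentials.
- Validate all external input with the schemas in `src/schemas/`.

## Performance (rarely specified per the study: 14.5%)
- Avoid N+1 queries; batch database reads instead.
```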
Related papers
- Codified Context: Infrastructure for AI Agents in a Complex Codebase [0.0]
This paper presents a three-component codified context infrastructure developed during the construction of a 108,000-line C# distributed system. The framework is published as an open-source companion repository.
arXiv Detail & Related papers (2026-02-24T02:11:26Z)
- CodeCompass: Navigating the Navigation Paradox in Agentic Code Intelligence [0.0]
We identify the Navigation Paradox: agents perform poorly because navigation and retrieval are fundamentally distinct problems. We demonstrate that graph-based structural navigation via Code, a Model Context Protocol server exposing dependency graphs, achieves 99.4% task completion on hidden-dependency tasks.
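The graph-based structural navigation idea can be sketched minimally (this is not the paper's implementation): given a file-level dependency graph, an agent can resolve hidden dependencies by traversing edges instead of retrieving by text similarity. The module names and edges here are invented for illustration.

```python
from collections import deque

# Toy file-level dependency graph: module -> modules it imports.
# All names are invented for illustration.
DEPS = {
    "api/handlers.py": ["core/service.py"],
    "core/service.py": ["core/repo.py", "core/models.py"],
    "core/repo.py": ["core/models.py"],
    "core/models.py": [],
}

def transitive_deps(start: str) -> list[str]:
    """Breadth-first traversal: every module `start` depends on,
    directly or transitively (the 'hidden' dependencies)."""
    seen, order = {start}, []
    queue = deque(DEPS.get(start, []))
    while queue:
        mod = queue.popleft()
        if mod in seen:
            continue
        seen.add(mod)
        order.append(mod)
        queue.extend(DEPS.get(mod, []))
    return order

print(transitive_deps("api/handlers.py"))
# ['core/service.py', 'core/repo.py', 'core/models.py']
```

A text-similarity retriever could miss `core/models.py` entirely if it shares no vocabulary with the query; the graph walk finds it by structure alone.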
arXiv Detail & Related papers (2026-02-23T16:58:37Z)
- Configuring Agentic AI Coding Tools: An Exploratory Study [11.643977424519]
We present a systematic analysis of configuration mechanisms for agentic AI coding tools, covering Claude Code, GitHub Copilot, Cursor, Gemini, and Codex. We identify eight configuration mechanisms and, in an empirical study of 2,926 GitHub repositories, examine whether and how they are adopted. We then explore Context Files, Skills, and Subagents, the three mechanisms available across tools, in more detail.
arXiv Detail & Related papers (2026-02-16T12:24:28Z)
- Evaluating AGENTS.md: Are Repository-Level Context Files Helpful for Coding Agents? [3.2610504259514754]
We study whether context files are effective for real-world tasks. We find that context files tend to reduce task success rates compared to providing no repository context. We conclude that unnecessary requirements from context files make tasks harder, and that human-written context files should describe only minimal requirements.
arXiv Detail & Related papers (2026-02-12T14:15:22Z)
- FS-Researcher: Test-Time Scaling for Long-Horizon Research Tasks with File-System-Based Agents [53.03492387564392]
We introduce FS-Researcher, a file-system-based framework that scales deep research beyond the context window via a persistent workspace. A Context Builder agent browses the internet, writes structured notes, and archives raw sources into a hierarchical knowledge base that can grow far beyond context length. A Report Writer agent then composes the final report section by section, treating the knowledge base as the source of facts.
arXiv Detail & Related papers (2026-02-02T03:00:19Z)
- Context Engineering for AI Agents in Open-Source Software [13.236926479239754]
GenAI-based coding assistants have disrupted software development. Their next generation is agent-based, operating with more autonomy and potentially without human oversight. One challenge is to provide AI agents with sufficient context about the software projects they operate in.
arXiv Detail & Related papers (2025-10-24T12:55:48Z)
- Trace: Securing Smart Contract Repository Against Access Control Vulnerability [58.02691083789239]
GitHub hosts numerous smart contract repositories containing source code, documentation, and configuration files. Third-party developers often reference, reuse, or fork code from these repositories during custom development. Existing tools for detecting smart contract vulnerabilities are limited in their ability to handle complex repositories.
arXiv Detail & Related papers (2025-10-22T05:18:28Z)
- RepoSummary: Feature-Oriented Summarization and Documentation Generation for Code Repositories [7.744086870383438]
RepoSummary is a feature-oriented code repository summarization approach that also generates repository documentation automatically. It establishes more accurate traceability links from functional features to the corresponding code elements.
arXiv Detail & Related papers (2025-10-13T06:16:44Z)
- On the Use of Agentic Coding Manifests: An Empirical Study of Claude Code [0.0]
Agentic coding tools receive goals written in natural language as input, break them down into specific tasks, and write or execute the actual code with minimal human intervention. Key to this process are agent manifests: configuration files (such as Claude.md) that provide agents with essential project context, identity, and operational rules. We analyzed 253 Claude.md files from 242 repositories to identify structural patterns and common content.
arXiv Detail & Related papers (2025-09-18T08:46:41Z)
- AgentArmor: Enforcing Program Analysis on Agent Runtime Trace to Defend Against Prompt Injection [14.522205401511727]
Large Language Model (LLM) agents offer a powerful new paradigm for solving various problems by combining natural language reasoning with the execution of external tools. In this work, we propose a novel insight that treats agent runtime traces as structured programs with analyzable semantics. We present AgentArmor, a program analysis framework that converts agent traces into graph intermediate representation-based structured program dependency representations.
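A minimal sketch of the trace-as-program idea (not AgentArmor's actual representation): model each tool call as a node, add an edge when one call's output feeds another's input, and flag data flows from untrusted sources into sensitive sinks. The tool names and the taint policy here are illustrative assumptions.

```python
# Each trace step: (step_id, tool, ids_of_inputs). Invented trace for illustration.
TRACE = [
    (0, "web_fetch", []),    # untrusted source: content from the open web
    (1, "summarize", [0]),   # consumes the fetched page
    (2, "shell_exec", [1]),  # sensitive sink: executes a command
]

UNTRUSTED = {"web_fetch"}
SINKS = {"shell_exec"}

def tainted_sinks(trace):
    """Propagate taint along data-dependency edges; return ids of
    sensitive sinks reachable from untrusted sources."""
    taint = set()
    flagged = []
    for step_id, tool, inputs in trace:  # trace is topologically ordered
        is_tainted = tool in UNTRUSTED or any(i in taint for i in inputs)
        if is_tainted:
            taint.add(step_id)
            if tool in SINKS:
                flagged.append(step_id)
    return flagged

print(tainted_sinks(TRACE))  # [2]: web content flowed into a shell command
```

Step 2 is flagged because its input derives, via the summary, from untrusted web content; a defense in this style could block or require approval for such calls.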
arXiv Detail & Related papers (2025-08-02T07:59:34Z)
- AGENTIF: Benchmarking Instruction Following of Large Language Models in Agentic Scenarios [51.46347732659174]
Large Language Models (LLMs) have demonstrated advanced capabilities in real-world agentic applications. AgentIF is the first benchmark for systematically evaluating LLM instruction-following ability in agentic scenarios.
arXiv Detail & Related papers (2025-05-22T17:31:10Z)
- SOPBench: Evaluating Language Agents at Following Standard Operating Procedures and Constraints [59.645885492637845]
SOPBench is an evaluation pipeline that transforms each service-specific SOP code program into a directed graph of executable functions and requires agents to call these functions based on natural language SOP descriptions. We evaluate 18 leading models, and results show the task is challenging even for top-tier models.
arXiv Detail & Related papers (2025-03-11T17:53:02Z)
- On The Importance of Reasoning for Context Retrieval in Repository-Level Code Editing [82.96523584351314]
We decouple the task of context retrieval from the other components of the repository-level code editing pipelines.
We conclude that while reasoning helps to improve the precision of the gathered context, it still lacks the ability to identify whether that context is sufficient.
arXiv Detail & Related papers (2024-06-06T19:44:17Z)
- CoCoMIC: Code Completion By Jointly Modeling In-file and Cross-file Context [82.88371379927112]
We propose a framework that incorporates cross-file context to learn the in-file and cross-file context jointly on top of pretrained code LMs.
CoCoMIC improves the existing code LM with a 33.94% relative increase in exact match and a 28.69% relative increase in identifier matching for code completion when cross-file context is provided.
arXiv Detail & Related papers (2022-12-20T05:48:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.