Related papers: Codified Context: Infrastructure for AI Agents in a Complex Codebase

URL: http://arxiv.org/abs/2602.20478v1
Date: Tue, 24 Feb 2026 02:11:26 GMT
Title: Codified Context: Infrastructure for AI Agents in a Complex Codebase
Authors: Aristidis Vasilopoulos
Abstract summary: This paper presents a three-component codified context infrastructure developed during construction of a 108,000-line C# distributed system. The framework is published as an open-source companion repository.
License: http://creativecommons.org/licenses/by/4.0/
Abstract: LLM-based agentic coding assistants lack persistent memory: they lose coherence across sessions, forget project conventions, and repeat known mistakes. Recent studies characterize how developers configure agents through manifest files, but an open challenge remains how to scale such configurations for large, multi-agent projects. This paper presents a three-component codified context infrastructure developed during construction of a 108,000-line C# distributed system: (1) a hot-memory constitution encoding conventions, retrieval hooks, and orchestration protocols; (2) 19 specialized domain-expert agents; and (3) a cold-memory knowledge base of 34 on-demand specification documents. Quantitative metrics on infrastructure growth and interaction patterns across 283 development sessions are reported alongside four observational case studies illustrating how codified context propagates across sessions to prevent failures and maintain consistency. The framework is published as an open-source companion repository.

Related papers

Architecture-Aware Multi-Design Generation for Repository-Level Feature Addition [53.50448142467294]
RAIM is a multi-design and architecture-aware framework for repository-level feature addition. It shifts away from linear patching by generating multiple diverse implementation designs. Experiments on the NoCode-bench Verified dataset demonstrate that RAIM establishes a new state-of-the-art performance.
arXiv Detail & Related papers (2026-03-02T12:50:40Z)
OpenClaw, Moltbook, and ClawdLab: From Agent-Only Social Networks to Autonomous Scientific Research [0.18995650644735798]
ClawdLab is an open-source platform for autonomous scientific research. The literature documents security vulnerabilities spanning 131 agent skills and over 15,200 exposed control panels. ClawdLab addresses these failure modes through hard role restrictions, structured adversarial critique, PI-led governance, multi-model orchestration, and domain-specific evidence requirements.
arXiv Detail & Related papers (2026-02-23T13:10:01Z)
Multi-CoLoR: Context-Aware Localization and Reasoning across Multi-Language Codebases [1.4216413758677147]
We present Multi-CoLoR, a framework for Context-aware localization and reasoning across Multi-Languages. It integrates organizational knowledge retrieval with graph-based reasoning to traverse complex software ecosystems.
arXiv Detail & Related papers (2026-02-23T00:54:59Z)
Evaluating AGENTS.md: Are Repository-Level Context Files Helpful for Coding Agents? [3.2610504259514754]
We study whether context files are effective for real-world tasks. We find that context files tend to reduce task success rates compared to providing no repository context. We conclude that unnecessary requirements from context files make tasks harder, and human-written context files should describe only minimal requirements.
arXiv Detail & Related papers (2026-02-12T14:15:22Z)
Do Not Treat Code as Natural Language: Implications for Repository-Level Code Generation and Beyond [13.550121154853715]
We present Hydra, a repository-level code generation framework that treats code as structured code rather than natural language. We show that Hydra achieves state-of-the-art performance across open- and closed-source CodeLLMs.
arXiv Detail & Related papers (2026-02-12T07:44:00Z)
FS-Researcher: Test-Time Scaling for Long-Horizon Research Tasks with File-System-Based Agents [53.03492387564392]
We introduce FS-Researcher, a file-system-based framework that scales deep research beyond the context window via a persistent workspace. A Context Builder agent browses the internet, writes structured notes, and archives raw sources into a hierarchical knowledge base that can grow far beyond context length. A Report Writer agent then composes the final report section by section, treating the knowledge base as the source of facts.
arXiv Detail & Related papers (2026-02-02T03:00:19Z)
Agent READMEs: An Empirical Study of Context Files for Agentic Coding [8.019313057979522]
We study 2,303 agent context files from 1,925 repositories to characterize their structure, maintenance, and content. We find that these files are not static documentation but complex, difficult-to-read artifacts that evolve like configuration code, maintained through frequent, small additions. These findings indicate that while developers use context files to make agents functional, they provide few guardrails to ensure that agent-written code is secure or performant, highlighting the need for improved tooling and practices.
arXiv Detail & Related papers (2025-11-17T02:18:55Z)
Analyzing and Internalizing Complex Policy Documents for LLM Agents [53.14898416858099]
Large Language Model (LLM)-based agentic systems rely on in-context policy documents encoding diverse business rules. This motivates developing internalization methods that embed policy documents into model priors while preserving performance. We introduce CC-Gen, an agentic benchmark generator with Controllable Complexity across four levels.
arXiv Detail & Related papers (2025-10-13T16:30:07Z)
Domain-Specific Data Generation Framework for RAG Adaptation [58.20906914537952]
Retrieval-Augmented Generation (RAG) combines the language understanding and reasoning power of large language models with external retrieval to enable domain-grounded responses. We propose RAGen, a framework for generating domain-grounded question-answer-context (QAC) triples tailored to diverse RAG adaptation approaches.
arXiv Detail & Related papers (2025-10-13T09:59:49Z)
Reasoning-Aware Prompt Orchestration: A Foundation Model for Multi-Agent Language Model Coordination [0.0]
We present a theoretically-grounded framework for dynamic prompt orchestration that enhances reasoning across multiple specialized agents. This framework addresses three core challenges: logical consistency preservation during agent transitions, reasoning-aware prompt adaptation, and scalable coordination of distributed inference. Experimental results on 1,000 synthetic multi-agent conversations demonstrate a 42% reduction in reasoning latency, a 23% improvement in logical consistency measured by ROUGE-L score, and an 89% success rate for task completion without context loss.
arXiv Detail & Related papers (2025-09-30T22:33:01Z)
Agent KB: Leveraging Cross-Domain Experience for Agentic Problem Solving [62.71545696485824]
We introduce AGENT KB, a universal memory infrastructure enabling seamless experience sharing across heterogeneous agent frameworks without retraining. AGENT KB aggregates trajectories into a structured knowledge base and serves lightweight APIs. We validate AGENT KB across major frameworks on GAIA, Humanity's Last Exam, GPQA, and SWE-bench.
arXiv Detail & Related papers (2025-07-08T17:59:22Z)
CLOVER: A Test Case Generation Benchmark with Coverage, Long-Context, and Verification [71.34070740261072]
This paper presents a benchmark, CLOVER, to evaluate models' capabilities in generating and completing test cases. The benchmark is containerized for code execution across tasks, and we will release the code, data, and construction methodologies.
arXiv Detail & Related papers (2025-02-12T21:42:56Z)
On The Importance of Reasoning for Context Retrieval in Repository-Level Code Editing [82.96523584351314]
We decouple the task of context retrieval from the other components of the repository-level code editing pipelines. We conclude that while the reasoning helps to improve the precision of the gathered context, it still lacks the ability to identify its sufficiency.
arXiv Detail & Related papers (2024-06-06T19:44:17Z)
