Related papers: Git Context Controller: Manage the Context of LLM-based Agents like Git

URL: http://arxiv.org/abs/2508.00031v1
Date: Wed, 30 Jul 2025 08:01:45 GMT
Title: Git Context Controller: Manage the Context of LLM-based Agents like Git
Authors: Junde Wu
Abstract summary: Large language model (LLM) based agents have shown impressive capabilities by interleaving internal reasoning with external tool use. We introduce Git-Context-Controller (GCC), a structured context management framework inspired by software version control systems. In a self-replication case study, a GCC-augmented agent builds a new CLI agent from scratch, achieving 40.7% task resolution, compared to only 11.7% without GCC.
Score: 6.521644491529639
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large language model (LLM) based agents have shown impressive capabilities by interleaving internal reasoning with external tool use. However, as these agents are deployed in long-horizon workflows, such as coding on a large, long-term project, context management becomes a critical bottleneck. We introduce Git-Context-Controller (GCC), a structured context management framework inspired by software version control systems. GCC elevates context to a versioned memory hierarchy, like Git. It structures agent memory as a persistent file system with explicit operations: COMMIT, BRANCH, MERGE, and CONTEXT, enabling milestone-based checkpointing, exploration of alternative plans, and structured reflection. Our approach empowers agents to manage long-term goals, isolate architectural experiments, and recover or hand off memory across sessions and agents. Empirically, agents equipped with GCC achieve state-of-the-art performance on the SWE-Bench-Lite benchmark, resolving 48.00% of software bugs and outperforming 26 competitive systems. In a self-replication case study, a GCC-augmented agent builds a new CLI agent from scratch, achieving 40.7% task resolution, compared to only 11.7% without GCC. The code is released at: https://github.com/theworldofagents/GCC
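
The four operations named in the abstract can be illustrated with a toy example. The following is only a minimal in-memory sketch, not the GCC implementation: the class name `ContextStore`, its method signatures, and the merge rule (append summaries the target branch has not seen) are all assumptions made for illustration, and the real framework persists memory as files rather than Python lists.

```python
from dataclasses import dataclass, field


@dataclass
class ContextStore:
    """Toy stand-in for a versioned agent memory (hypothetical, not GCC's API)."""

    # Each branch maps to an ordered list of commit summaries.
    branches: dict = field(default_factory=lambda: {"main": []})
    current: str = "main"

    def commit(self, summary: str) -> None:
        # COMMIT: checkpoint a milestone summary on the current branch.
        self.branches[self.current].append(summary)

    def branch(self, name: str) -> None:
        # BRANCH: copy the current history so an alternative plan stays isolated.
        if name in self.branches:
            raise ValueError(f"branch {name!r} already exists")
        self.branches[name] = list(self.branches[self.current])
        self.current = name

    def merge(self, source: str, target: str = "main") -> None:
        # MERGE (assumed rule): append summaries the target has not seen yet,
        # then switch back to the target branch.
        seen = set(self.branches[target])
        for summary in self.branches[source]:
            if summary not in seen:
                self.branches[target].append(summary)
                seen.add(summary)
        self.current = target

    def context(self, last_n: int = 3) -> list:
        # CONTEXT: retrieve the most recent summaries on the current branch.
        if last_n <= 0:
            return []
        return self.branches[self.current][-last_n:]


store = ContextStore()
store.commit("set up project skeleton")
store.branch("try-cache")
store.commit("prototype caching layer")
store.merge("try-cache")
print(store.context())
```

Copying the history on BRANCH keeps the experiment from touching `main` until MERGE is called, which mirrors the "isolate architectural experiments" goal stated in the abstract.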

Related papers

Meta-RAG on Large Codebases Using Code Summarization [11.415083231118142]
Large Language Model (LLM) systems have been at the forefront of applied Artificial Intelligence (AI) research in a multitude of domains. We propose a multi-agent system to localize bugs in large pre-existing codebases using information retrieval and LLMs. Our system introduces a novel Retrieval Augmented Generation (RAG) approach, Meta-RAG, where we utilize summaries to condense codebases by an average of 79.8% into a compact, structured, natural language representation.
arXiv Detail & Related papers (2025-08-04T17:01:10Z)
LLM-Driven Collaborative Model for Untangling Commits via Explicit and Implicit Dependency Reasoning [20.147009997147798]
We propose ColaUntangle, a new collaborative consultation framework for commit untangling. ColaUntangle integrates Large Language Model (LLM)-driven agents in a multi-agent architecture. We construct multi-version Program Dependency Graphs (delta-PDG), enabling agents to reason over code relationships with both symbolic and semantic depth.
arXiv Detail & Related papers (2025-07-22T09:42:13Z)
SwingArena: Competitive Programming Arena for Long-context GitHub Issue Solving [90.32201622392137]
We present SwingArena, a competitive evaluation framework for Large Language Models (LLMs). Unlike traditional static benchmarks, SwingArena models the collaborative process of software iteration by pairing LLMs as submitters, who generate patches, and reviewers, who create test cases and verify the patches through continuous integration (CI) pipelines.
arXiv Detail & Related papers (2025-05-29T18:28:02Z)
RepoMaster: Autonomous Exploration and Understanding of GitHub Repositories for Complex Task Solving [9.477917878478188]
RepoMaster is an autonomous agent framework designed to explore and reuse GitHub repositories for solving complex tasks. RepoMaster constructs function-call graphs, module-dependency graphs, and hierarchical code trees to identify essential components. On our newly released GitTaskBench, RepoMaster lifts the task-pass rate from 24.1% to 62.9% while reducing token usage by 95%.
arXiv Detail & Related papers (2025-05-27T08:35:05Z)
LocAgent: Graph-Guided LLM Agents for Code Localization [25.395102705800916]
LocAgent is a framework that addresses code localization through graph-based representation. Our method with the fine-tuned Qwen-2.5-Coder-Instruct-32B model achieves comparable results to SOTA proprietary models at greatly reduced cost.
arXiv Detail & Related papers (2025-03-12T05:55:01Z)
CodeTree: Agent-guided Tree Search for Code Generation with Large Language Models [106.11371409170818]
Large language models (LLMs) can act as agents with capabilities to self-refine and improve generated code autonomously. We propose CodeTree, a framework for LLM agents to efficiently explore the search space in different stages of the code generation process. Specifically, we adopted a unified tree structure to explicitly explore different coding strategies, generate corresponding coding solutions, and subsequently refine the solutions.
arXiv Detail & Related papers (2024-11-07T00:09:54Z)
Gödel Agent: A Self-Referential Agent Framework for Recursive Self-Improvement [112.04307762405669]
Gödel Agent is a self-evolving framework inspired by the Gödel machine. Gödel Agent can achieve continuous self-improvement, surpassing manually crafted agents in performance, efficiency, and generalizability.
arXiv Detail & Related papers (2024-10-06T10:49:40Z)
MemoRAG: Boosting Long Context Processing with Global Memory-Enhanced Retrieval Augmentation [60.04380907045708]
Retrieval-Augmented Generation (RAG) is considered a promising strategy to address this problem. We propose MemoRAG, a novel RAG framework empowered by global memory-augmented retrieval. MemoRAG achieves superior performances across a variety of long-context evaluation tasks.
arXiv Detail & Related papers (2024-09-09T13:20:31Z)
Long Code Arena: a Set of Benchmarks for Long-Context Code Models [75.70507534322336]
Long Code Arena is a suite of six benchmarks for code processing tasks that require project-wide context. These tasks cover different aspects of code processing: library-based code generation, CI builds repair, project-level code completion, commit message generation, bug localization, and module summarization. For each task, we provide a manually verified dataset for testing, an evaluation suite, and open-source baseline solutions.
arXiv Detail & Related papers (2024-06-17T14:58:29Z)
On the Impacts of Contexts on Repository-Level Code Generation [5.641402231731082]
We present RepoExec, a novel benchmark designed to evaluate repository-level code generation. We focus on three key aspects: executability, functional correctness through comprehensive test case generation, and accurate utilization of cross-file contexts.
arXiv Detail & Related papers (2024-06-17T10:45:22Z)
Alibaba LingmaAgent: Improving Automated Issue Resolution via Comprehensive Repository Exploration [64.19431011897515]
This paper presents Alibaba LingmaAgent, a novel Automated Software Engineering method designed to comprehensively understand and utilize whole software repositories for issue resolution. Our approach introduces a top-down method to condense critical repository information into a knowledge graph, reducing complexity, and employs a Monte Carlo tree search based strategy. In production deployment and evaluation at Alibaba Cloud, LingmaAgent automatically resolved 16.9% of in-house issues faced by development engineers, and solved 43.3% of problems after manual intervention.
arXiv Detail & Related papers (2024-06-03T15:20:06Z)
Enhancing Open-Domain Task-Solving Capability of LLMs via Autonomous Tool Integration from GitHub [79.31134731122462]
We introduce the OpenAct benchmark to evaluate open-domain task-solving capability, built on human expert consultation and repositories on GitHub. We present OpenAgent, a novel LLM-based agent system that can tackle evolving queries in open domains by autonomously integrating specialized tools from GitHub.
arXiv Detail & Related papers (2023-12-28T15:47:30Z)

This list is automatically generated from the titles and abstracts of the papers on this site.