Related papers: RepoAgent: An LLM-Powered Open-Source Framework for Repository-level Code Documentation Generation

URL: http://arxiv.org/abs/2402.16667v1
Date: Mon, 26 Feb 2024 15:39:52 GMT
Title: RepoAgent: An LLM-Powered Open-Source Framework for Repository-level Code Documentation Generation
Authors: Qinyu Luo, Yining Ye, Shihao Liang, Zhong Zhang, Yujia Qin, Yaxi Lu, Yesai Wu, Xin Cong, Yankai Lin, Yingli Zhang, Xiaoyin Che, Zhiyuan Liu, Maosong Sun
Abstract summary: We introduce RepoAgent, a large language model powered open-source framework aimed at proactively generating, maintaining, and updating code documentation. We have validated the effectiveness of our approach, showing that RepoAgent excels in generating high-quality repository-level documentation.
Score: 79.83270415843857
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Generative models have demonstrated considerable potential in software engineering, particularly in tasks such as code generation and debugging. However, their utilization in the domain of code documentation generation remains underexplored. To this end, we introduce RepoAgent, a large language model powered open-source framework aimed at proactively generating, maintaining, and updating code documentation. Through both qualitative and quantitative evaluations, we have validated the effectiveness of our approach, showing that RepoAgent excels in generating high-quality repository-level documentation. The code and results are publicly accessible at https://github.com/OpenBMB/RepoAgent.

Related papers

Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning [57.09163579304332]
We introduce PaperCoder, a framework that transforms machine learning papers into functional code repositories. PaperCoder operates in three stages; in the planning stage, it designs the system architecture with diagrams, identifies file dependencies, and generates configuration files. We then evaluate PaperCoder on generating code implementations from machine learning papers, using both model-based and human evaluations.
arXiv Detail & Related papers (2025-04-24T01:57:01Z)
DocAgent: A Multi-Agent System for Automated Code Documentation Generation [7.653779364214401]
We introduce DocAgent, a novel multi-agent collaborative system using topological code processing for incremental context building. Specialized agents (Reader, Searcher, Writer, Verifier, Orchestrator) then collaboratively generate documentation. We also propose a multi-faceted evaluation framework assessing Completeness, Helpfulness, and Truthfulness.
arXiv Detail & Related papers (2025-04-11T17:50:08Z)
Codev-Bench: How Do LLMs Understand Developer-Centric Code Completion? [60.84912551069379]
We present the Code-Development Benchmark (Codev-Bench), a fine-grained, real-world, repository-level, and developer-centric evaluation framework. Codev-Agent is an agent-based system that automates repository crawling, constructs execution environments, extracts dynamic calling chains from existing unit tests, and generates new test samples to avoid data leakage.
arXiv Detail & Related papers (2024-10-02T09:11:10Z)
Supporting Software Maintenance with Dynamically Generated Document Hierarchies [41.407915858583344]
We present HGEN, a fully automated pipeline that transforms source code through a series of six stages into a well-organized hierarchy of formatted documents. We evaluate HGEN both quantitatively and qualitatively. Results show that HGEN produces artifact hierarchies similar in quality to manually constructed documentation, with much higher coverage of the core concepts than the baseline approach.
arXiv Detail & Related papers (2024-08-11T17:11:14Z)
CodeRAG-Bench: Can Retrieval Augment Code Generation? [78.37076502395699]
We conduct a systematic, large-scale analysis of code generation using retrieval-augmented generation. We first curate a comprehensive evaluation benchmark, CodeRAG-Bench, encompassing three categories of code generation tasks. We examine top-performing models on CodeRAG-Bench by providing contexts retrieved from one or multiple sources.
arXiv Detail & Related papers (2024-06-20T16:59:52Z)
On the Impacts of Contexts on Repository-Level Code Generation [5.641402231731082]
We present a novel benchmark designed to evaluate repository-level code generation. We focus on three key aspects: executability, functional correctness through comprehensive test case generation, and accurate utilization of cross-file contexts.
arXiv Detail & Related papers (2024-06-17T10:45:22Z)
VersiCode: Towards Version-controllable Code Generation [58.82709231906735]
Large Language Models (LLMs) have made tremendous strides in code generation, but existing research fails to account for the dynamic nature of software development. We propose two novel tasks aimed at bridging this gap: version-specific code completion (VSCC) and version-aware code migration (VACM). We conduct an extensive evaluation on VersiCode, which reveals that version-controllable code generation is indeed a significant challenge.
arXiv Detail & Related papers (2024-06-11T16:15:06Z)
Alibaba LingmaAgent: Improving Automated Issue Resolution via Comprehensive Repository Exploration [64.19431011897515]
This paper presents Alibaba LingmaAgent, a novel Automated Software Engineering method designed to comprehensively understand and utilize whole software repositories for issue resolution. Our approach introduces a top-down method to condense critical repository information into a knowledge graph, reducing complexity, and employs a Monte Carlo tree search based strategy. In production deployment and evaluation at Alibaba Cloud, LingmaAgent automatically resolved 16.9% of in-house issues faced by development engineers, and solved 43.3% of problems after manual intervention.
arXiv Detail & Related papers (2024-06-03T15:20:06Z)
CodeAgent: Enhancing Code Generation with Tool-Integrated Agent Systems for Real-World Repo-level Coding Challenges [41.038584732889895]
Large Language Models (LLMs) have shown promise in automated code generation but typically excel only in simpler tasks. Our research pivots towards evaluating LLMs in a more realistic setting -- real-world repo-level code generation. We present CodeAgent, a novel LLM-based agent framework that employs external tools for effective repo-level code generation.
arXiv Detail & Related papers (2024-01-14T18:12:03Z)
RepoCoder: Repository-Level Code Completion Through Iterative Retrieval and Generation [96.75695811963242]
RepoCoder is a framework to streamline the repository-level code completion process. It incorporates a similarity-based retriever and a pre-trained code language model. It consistently outperforms the vanilla retrieval-augmented code completion approach.
arXiv Detail & Related papers (2023-03-22T13:54:46Z)
Generation-Augmented Query Expansion For Code Retrieval [51.20943646688115]
We propose a generation-augmented query expansion framework, inspired by the human retrieval process of sketching an answer before searching. We achieve new state-of-the-art results on the CodeSearchNet benchmark.
arXiv Detail & Related papers (2022-12-20T23:49:37Z)

This list is automatically generated from the titles and abstracts of the papers on this site.