RepoAgent: An LLM-Powered Open-Source Framework for Repository-level
Code Documentation Generation
- URL: http://arxiv.org/abs/2402.16667v1
- Date: Mon, 26 Feb 2024 15:39:52 GMT
- Title: RepoAgent: An LLM-Powered Open-Source Framework for Repository-level
Code Documentation Generation
- Authors: Qinyu Luo, Yining Ye, Shihao Liang, Zhong Zhang, Yujia Qin, Yaxi Lu,
Yesai Wu, Xin Cong, Yankai Lin, Yingli Zhang, Xiaoyin Che, Zhiyuan Liu,
Maosong Sun
- Abstract summary: We introduce RepoAgent, a large language model powered open-source framework aimed at proactively generating, maintaining, and updating code documentation.
We have validated the effectiveness of our approach, showing that RepoAgent excels in generating high-quality repository-level documentation.
- Score: 79.83270415843857
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generative models have demonstrated considerable potential in software
engineering, particularly in tasks such as code generation and debugging.
However, their utilization in the domain of code documentation generation
remains underexplored. To this end, we introduce RepoAgent, a large language
model powered open-source framework aimed at proactively generating,
maintaining, and updating code documentation. Through both qualitative and
quantitative evaluations, we have validated the effectiveness of our approach,
showing that RepoAgent excels in generating high-quality repository-level
documentation. The code and results are publicly accessible at
https://github.com/OpenBMB/RepoAgent.
Related papers
- CodeRAG-Bench: Can Retrieval Augment Code Generation? [78.37076502395699]
We conduct a systematic, large-scale analysis of code generation using retrieval-augmented generation.
We first curate a comprehensive evaluation benchmark, CodeRAG-Bench, encompassing three categories of code generation tasks.
We examine top-performing models on CodeRAG-Bench by providing contexts retrieved from one or multiple sources.
arXiv Detail & Related papers (2024-06-20T16:59:52Z)
- Code Agents are State of the Art Software Testers [10.730852617039451]
We investigate the capability of LLM-based Code Agents for formalizing user issues into test cases.
We propose a novel benchmark based on popular GitHub repositories, containing real-world issues, ground-truth patches, and golden tests.
We find that LLMs generally perform surprisingly well at generating relevant test cases with Code Agents designed for code repair.
arXiv Detail & Related papers (2024-06-18T14:54:37Z)
- REPOEXEC: Evaluate Code Generation with a Repository-Level Executable Benchmark [5.641402231731082]
We introduce RepoExec, a novel benchmark for evaluating code generation at the repository-level scale.
RepoExec focuses on three main aspects: executability, functional correctness through automated test case generation with high coverage rate, and carefully crafted cross-file contexts to accurately generate code.
arXiv Detail & Related papers (2024-06-17T10:45:22Z)
- CodeAgent: Enhancing Code Generation with Tool-Integrated Agent Systems for Real-World Repo-level Coding Challenges [44.028079593225584]
Large Language Models (LLMs) have shown promise in automated code generation but typically excel only in simpler tasks.
Our research pivots towards evaluating LLMs in a more realistic setting -- real-world repo-level code generation.
We present CodeAgent, a novel LLM-based agent framework that employs external tools for effective repo-level code generation.
arXiv Detail & Related papers (2024-01-14T18:12:03Z)
- RepoCoder: Repository-Level Code Completion Through Iterative Retrieval and Generation [96.75695811963242]
RepoCoder is a framework to streamline the repository-level code completion process.
It incorporates a similarity-based retriever and a pre-trained code language model.
It consistently outperforms the vanilla retrieval-augmented code completion approach.
arXiv Detail & Related papers (2023-03-22T13:54:46Z)
- Generation-Augmented Query Expansion For Code Retrieval [51.20943646688115]
We propose a generation-augmented query expansion framework.
The framework is inspired by the human retrieval process of sketching an answer before searching.
We achieve new state-of-the-art results on the CodeSearchNet benchmark.
arXiv Detail & Related papers (2022-12-20T23:49:37Z)
- CodeExp: Explanatory Code Document Generation [94.43677536210465]
Existing code-to-text generation models produce only high-level summaries of code.
We conduct a human study to identify the criteria for high-quality explanatory docstrings for code.
We present a multi-stage fine-tuning strategy and baseline models for the task.
arXiv Detail & Related papers (2022-11-25T18:05:44Z)
- Generate rather than Retrieve: Large Language Models are Strong Context Generators [74.87021992611672]
We present a novel perspective for solving knowledge-intensive tasks by replacing document retrievers with large language model generators.
We call our method generate-then-read (GenRead), which first prompts a large language model to generate contextual documents based on a given question, and then reads the generated documents to produce the final answer.
arXiv Detail & Related papers (2022-09-21T01:30:59Z)
This list is automatically generated from the titles and abstracts of the papers on this site.