DocAgent: A Multi-Agent System for Automated Code Documentation Generation
- URL: http://arxiv.org/abs/2504.08725v2
- Date: Fri, 18 Apr 2025 04:32:43 GMT
- Title: DocAgent: A Multi-Agent System for Automated Code Documentation Generation
- Authors: Dayu Yang, Antoine Simoulin, Xin Qian, Xiaoyi Liu, Yuwei Cao, Zhaopu Teng, Grey Yang,
- Abstract summary: We introduce DocAgent, a novel multi-agent collaborative system using topological code processing for incremental context building. Specialized agents (Reader, Searcher, Writer, Verifier, Orchestrator) then collaboratively generate documentation. We also propose a multi-faceted evaluation framework assessing Completeness, Helpfulness, and Truthfulness.
- Score: 7.653779364214401
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: High-quality code documentation is crucial for software development, especially in the era of AI. However, generating it automatically using Large Language Models (LLMs) remains challenging, as existing approaches often produce incomplete, unhelpful, or factually incorrect outputs. We introduce DocAgent, a novel multi-agent collaborative system using topological code processing for incremental context building. Specialized agents (Reader, Searcher, Writer, Verifier, Orchestrator) then collaboratively generate documentation. We also propose a multi-faceted evaluation framework assessing Completeness, Helpfulness, and Truthfulness. Comprehensive experiments show DocAgent consistently and significantly outperforms baselines. Our ablation study confirms the vital role of the topological processing order. DocAgent offers a robust approach for reliable code documentation generation in complex and proprietary repositories.
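The "topological code processing for incremental context building" mentioned in the abstract can be illustrated with a minimal sketch. This is not DocAgent's implementation: the dependency map, the component names, and the `write_doc` callback are hypothetical, and the ordering is delegated to Python's standard `graphlib` module. The idea shown is only that each component is documented after its dependencies, so their already-written documentation is available as context.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: component -> set of components it depends on.
deps = {
    "utils.parse": set(),
    "models.User": {"utils.parse"},
    "service.create_user": {"models.User", "utils.parse"},
}


def documentation_order(dependencies):
    """Return components so that each one appears after all of its dependencies."""
    return list(TopologicalSorter(dependencies).static_order())


def build_docs(dependencies, write_doc):
    """Document components in dependency order.

    `write_doc(name, context)` is a hypothetical callback (for example an LLM
    call); `context` holds the docs already written for the dependencies.
    """
    docs = {}
    for name in documentation_order(dependencies):
        context = {dep: docs[dep] for dep in sorted(dependencies.get(name, ()))}
        docs[name] = write_doc(name, context)
    return docs


def stub_writer(name, context):
    """Stand-in for a real documentation writer, used only for illustration."""
    return f"{name}: documented with context from {sorted(context)}"


docs = build_docs(deps, stub_writer)
```

In this sketch a circular dependency makes `static_order()` raise `graphlib.CycleError`; a real system would need some strategy for handling cycles before ordering.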
Related papers
- Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning [57.09163579304332]
We introduce PaperCoder, a framework that transforms machine learning papers into functional code repositories.
PaperCoder operates in three stages. In the planning stage, it designs the system architecture with diagrams, identifies file dependencies, and generates configuration files.
We then evaluate PaperCoder on generating code implementations from machine learning papers based on both model-based and human evaluations.
arXiv Detail & Related papers (2025-04-24T01:57:01Z)
- MDocAgent: A Multi-Modal Multi-Agent Framework for Document Understanding [40.52017994491893]
MDocAgent is a novel RAG and multi-agent framework that leverages both text and image. Our system employs five specialized agents: a general agent, a critical agent, a text agent, an image agent and a summarizing agent. Preliminary experiments on five benchmarks demonstrate the effectiveness of our MDocAgent, achieving an average improvement of 12.1%.
arXiv Detail & Related papers (2025-03-18T06:57:21Z)
- Doc-Guided Sent2Sent++: A Sent2Sent++ Agent with Doc-Guided memory for Document-level Machine Translation [11.36816954288264]
This paper introduces Doc-Guided Sent2Sent++, an Agent that employs an incremental sentence-level forced decoding strategy. We demonstrate that Sent2Sent++ outperforms other methods in terms of quality, consistency, and fluency.
arXiv Detail & Related papers (2025-01-15T02:25:35Z)
- BigDocs: An Open Dataset for Training Multimodal Models on Document and Code Tasks [57.589795399265945]
We introduce BigDocs-7.5M, a high-quality, open-access dataset comprising 7.5 million multimodal documents across 30 tasks. We also introduce BigDocs-Bench, a benchmark suite with 10 novel tasks. Our experiments show that training with BigDocs-Bench improves average performance up to 25.8% over closed-source GPT-4o.
arXiv Detail & Related papers (2024-12-05T21:41:20Z)
- Commit0: Library Generation from Scratch [77.38414688148006]
Commit0 is a benchmark that challenges AI agents to write libraries from scratch.
Agents are provided with a specification document outlining the library's API as well as a suite of interactive unit tests.
Commit0 also offers an interactive environment where models receive static analysis and execution feedback on the code they generate.
arXiv Detail & Related papers (2024-12-02T18:11:30Z)
- Codev-Bench: How Do LLMs Understand Developer-Centric Code Completion? [60.84912551069379]
We present the Code-Development Benchmark (Codev-Bench), a fine-grained, real-world, repository-level, and developer-centric evaluation framework.
Codev-Agent is an agent-based system that automates repository crawling, constructs execution environments, extracts dynamic calling chains from existing unit tests, and generates new test samples to avoid data leakage.
arXiv Detail & Related papers (2024-10-02T09:11:10Z)
- ComfyBench: Benchmarking LLM-based Agents in ComfyUI for Autonomously Designing Collaborative AI Systems [80.69865295743149]
This work attempts to study using LLM-based agents to design collaborative AI systems autonomously. Based on ComfyBench, we develop ComfyAgent, a framework that empowers agents to autonomously design collaborative AI systems by generating. While ComfyAgent achieves a comparable resolve rate to o1-preview and significantly surpasses other agents on ComfyBench, ComfyAgent has resolved only 15% of creative tasks.
arXiv Detail & Related papers (2024-09-02T17:44:10Z)
- Supporting Software Maintenance with Dynamically Generated Document Hierarchies [41.407915858583344]
We present HGEN, a fully automated pipeline that transforms source code through a series of six stages into a well-organized hierarchy of formatted documents.
We evaluate HGEN both quantitatively and qualitatively.
Results show that HGEN produces artifact hierarchies similar in quality to manually constructed documentation, with much higher coverage of the core concepts than the baseline approach.
arXiv Detail & Related papers (2024-08-11T17:11:14Z)
- CodeRAG-Bench: Can Retrieval Augment Code Generation? [78.37076502395699]
We conduct a systematic, large-scale analysis of code generation using retrieval-augmented generation. We first curate a comprehensive evaluation benchmark, CodeRAG-Bench, encompassing three categories of code generation tasks. We examine top-performing models on CodeRAG-Bench by providing contexts retrieved from one or multiple sources.
arXiv Detail & Related papers (2024-06-20T16:59:52Z)
- RepoAgent: An LLM-Powered Open-Source Framework for Repository-level Code Documentation Generation [79.83270415843857]
We introduce RepoAgent, a large language model powered open-source framework aimed at proactively generating, maintaining, and updating code documentation.
We have validated the effectiveness of our approach, showing that RepoAgent excels in generating high-quality repository-level documentation.
arXiv Detail & Related papers (2024-02-26T15:39:52Z)
- DocGen: Generating Detailed Parameter Docstrings in Python [0.0]
We propose a multi-step approach that combines multiple task-specific models, each adept at producing a specific section of a docstring.
We compared the results from our approach with existing generative models using both automatic metrics and a human-centred evaluation with 17 participating developers.
arXiv Detail & Related papers (2023-11-11T01:14:37Z)
- DAPR: A Benchmark on Document-Aware Passage Retrieval [57.45793782107218]
We propose and name this task Document-Aware Passage Retrieval (DAPR).
While analyzing the errors of the State-of-The-Art (SoTA) passage retrievers, we find the major errors (53.5%) are due to missing document context.
Our created benchmark enables future research on developing and comparing retrieval systems for the new task.
arXiv Detail & Related papers (2023-05-23T10:39:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.