Related papers: CodeNav: Beyond tool-use to using real-world codebases with LLM agents

CodeNav: Beyond tool-use to using real-world codebases with LLM agents

URL: http://arxiv.org/abs/2406.12276v1
Date: Tue, 18 Jun 2024 05:10:38 GMT
Title: CodeNav: Beyond tool-use to using real-world codebases with LLM agents
Authors: Tanmay Gupta, Luca Weihs, Aniruddha Kembhavi,
Abstract summary: We present CodeNav, an LLM agent that navigates and leverages previously unseen code repositories to solve user queries. CodeNav automatically indexes and searches over code blocks in the target. It finds relevant code snippets, imports them, and uses them to iteratively generate a solution with execution feedback.
Score: 38.63753787418723
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We present CodeNav, an LLM agent that navigates and leverages previously unseen code repositories to solve user queries. In contrast to tool-use LLM agents that require ``registration'' of all relevant tools via manual descriptions within the LLM context, CodeNav automatically indexes and searches over code blocks in the target codebase, finds relevant code snippets, imports them, and uses them to iteratively generate a solution with execution feedback. To highlight the core-capabilities of CodeNav, we first showcase three case studies where we use CodeNav for solving complex user queries using three diverse codebases. Next, on three benchmarks, we quantitatively compare the effectiveness of code-use (which only has access to the target codebase) to tool-use (which has privileged access to all tool names and descriptions). Finally, we study the effect of varying kinds of tool and library descriptions on code-use performance, as well as investigate the advantage of the agent seeing source code as opposed to natural descriptions of code. All code will be made open source under a permissive license.

Related papers

VersiCode: Towards Version-controllable Code Generation [58.82709231906735]
We introduce VersiCode, the first comprehensive dataset designed to assess the ability of large language models to generate verifiable code for specific library versions. We design two dedicated evaluation tasks: version-specific code completion (VSCC) and version-aware code editing (VACE) Comprehensive experiments are conducted to benchmark the performance of LLMs, revealing the challenging nature of these tasks and VersiCode.
arXiv Detail & Related papers (2024-06-11T16:15:06Z)
CodeCloak: A Method for Evaluating and Mitigating Code Leakage by LLM Code Assistants [23.462703429753706]
We propose two complementary methods to mitigate the risk of code leakage when using LLM-based code assistants. The first is a technique for reconstructing a developer's original from code segments sent to the code assistant service. The second is CodeCloak, a novel deep reinforcement learning agent that manipulates the prompts before sending them to the code assistant service.
arXiv Detail & Related papers (2024-04-13T19:30:58Z)
Executable Code Actions Elicit Better LLM Agents [76.95566120678787]
This work proposes to use Python code to consolidate Large Language Model (LLM) agents' actions into a unified action space (CodeAct) integrated with a Python interpreter, CodeAct can execute code actions and dynamically revise prior actions or emit new actions upon new observations through multi-turn interactions. The encouraging performance of CodeAct motivates us to build an open-source LLM agent that interacts with environments by executing interpretable code and collaborates with users using natural language.
arXiv Detail & Related papers (2024-02-01T21:38:58Z)
CodeAgent: Enhancing Code Generation with Tool-Integrated Agent Systems for Real-World Repo-level Coding Challenges [44.028079593225584]
Large Language Models (LLMs) have shown promise in automated code generation but typically excel only in simpler tasks. Our research pivots towards evaluating LLMs in a more realistic setting -- real-world repo-level code generation. We present CodeAgent, a novel LLM-based agent framework that employs external tools for effective repo-level code generation.
arXiv Detail & Related papers (2024-01-14T18:12:03Z)
EASYTOOL: Enhancing LLM-based Agents with Concise Tool Instruction [56.02100384015907]
EasyTool is a framework transforming diverse and lengthy tool documentation into a unified and concise tool instruction. It can significantly reduce token consumption and improve the performance of tool utilization in real-world scenarios.
arXiv Detail & Related papers (2024-01-11T15:45:11Z)
A^3-CodGen: A Repository-Level Code Generation Framework for Code Reuse with Local-Aware, Global-Aware, and Third-Party-Library-Aware [13.850755485655435]
We propose a novel code generation framework, dubbed A3-CodGen, to harness information within the code repository to generate code with fewer potential logical errors. We identify three categories of representative information for the code repository: local-aware information from current code file, global-aware information from other code files, and third-party-library information. Results demonstrate that by adopting the A3-CodGen framework, we successfully extract, fuse, and feed code repository information into the LLM, generating more accurate, efficient, and highly reusable code.
arXiv Detail & Related papers (2023-12-10T05:36:06Z)
InstructCoder: Instruction Tuning Large Language Models for Code Editing [26.160498475809266]
We explore the use of Large Language Models (LLMs) to edit code based on user instructions. InstructCoder is the first instruction-tuning dataset designed to adapt LLMs for general-purpose code editing. Our findings reveal that open-source LLMs fine-tuned on InstructCoder can significantly enhance the accuracy of code edits.
arXiv Detail & Related papers (2023-10-31T10:15:35Z)
Using an LLM to Help With Code Understanding [13.53616539787915]
Large language models (LLMs) are revolutionizing the process of writing code. Our plugin queries OpenAI's GPT-3.5-turbo model with four high-level requests without the user having to write explicit prompts. We evaluate this system in a user study with 32 participants, which confirms that using our plugin can aid task completion more than web search.
arXiv Detail & Related papers (2023-07-17T00:49:06Z)
Generation-Augmented Query Expansion For Code Retrieval [51.20943646688115]
We propose a generation-augmented query expansion framework. Inspired by the human retrieval process - sketching an answer before searching. We achieve new state-of-the-art results on the CodeSearchNet benchmark.
arXiv Detail & Related papers (2022-12-20T23:49:37Z)
DocCoder: Generating Code by Retrieving and Reading Docs [87.88474546826913]
We introduce DocCoder, an approach that explicitly leverages code manuals and documentation. Our approach is general, can be applied to any programming language, and is agnostic to the underlying neural model.
arXiv Detail & Related papers (2022-07-13T06:47:51Z)

This list is automatically generated from the titles and abstracts of the papers in this site.