Related papers: On the Impacts of Contexts on Repository-Level Code Generation

On the Impacts of Contexts on Repository-Level Code Generation

URL: http://arxiv.org/abs/2406.11927v4
Date: Sun, 09 Feb 2025 08:47:20 GMT
Title: On the Impacts of Contexts on Repository-Level Code Generation
Authors: Nam Le Hai, Dung Manh Nguyen, Nghi D. Q. Bui,
Abstract summary: We present RepoExec, a novel benchmark designed to evaluate repository-level code generation.<n>We focus on three key aspects: executability, functional correctness through comprehensive test case generation, and accurate utilization of cross-file contexts.
Score: 5.641402231731082
License: http://creativecommons.org/publicdomain/zero/1.0/
Abstract: CodeLLMs have gained widespread adoption for code generation tasks, yet their capacity to handle repository-level code generation with complex contextual dependencies remains underexplored. Our work underscores the critical importance of leveraging repository-level contexts to generate executable and functionally correct code. We present RepoExec, a novel benchmark designed to evaluate repository-level code generation, with a focus on three key aspects: executability, functional correctness through comprehensive test case generation, and accurate utilization of cross-file contexts. Our study examines a controlled scenario where developers specify essential code dependencies (contexts), challenging models to integrate them effectively. Additionally, we introduce an instruction-tuned dataset that enhances CodeLLMs' ability to leverage dependencies, along with a new metric, Dependency Invocation Rate (DIR), to quantify context utilization. Experimental results reveal that while pretrained LLMs demonstrate superior performance in terms of correctness, instruction-tuned models excel in context utilization and debugging capabilities. RepoExec offers a comprehensive evaluation framework for assessing code functionality and alignment with developer intent, thereby advancing the development of more reliable CodeLLMs for real-world applications. The dataset and source code are available at https://github.com/FSoft-AI4Code/RepoExec.

Related papers

FEA-Bench: A Benchmark for Evaluating Repository-Level Code Generation for Feature Implementation [26.14778133391999]
FEA-Bench is a benchmark designed to assess the ability of large language models to perform incremental development within code repositories. We collect pull requests from 83 GitHub repositories and use rule-based and intent-based filtering to construct task instances focused on new feature development.
arXiv Detail & Related papers (2025-03-09T16:11:57Z)
Code Summarization Beyond Function Level [0.213063058314067]
This study investigated the effectiveness of code summarization models beyond the function level. The fine-tuned state-of-the-art CodeT5+ base model excelled in code summarization. Repository-level summarization exhibited promising potential but requires significant computational resources.
arXiv Detail & Related papers (2025-02-23T20:31:21Z)
ExecRepoBench: Multi-level Executable Code Completion Evaluation [45.963424627710765]
We introduce a novel framework for enhancing code completion in software development through the creation of a repository-level benchmark ExecRepoBench. We present a multi-level grammar-based completion methodology conditioned on the abstract syntax tree to mask code fragments at various logical units. Then, we fine-tune the open-source LLM with 7B parameters on Repo-Instruct to produce a strong code completion baseline model Qwen2.5-Coder-Instruct-C.
arXiv Detail & Related papers (2024-12-16T17:14:35Z)
ContextModule: Improving Code Completion via Repository-level Contextual Information [11.459065573651348]
ContextModule improves the relevance and precision of generated code. We implement performance optimizations, such as index caching, to ensure the system meets the latency constraints of real-world coding environments.
arXiv Detail & Related papers (2024-12-11T03:15:49Z)
Codev-Bench: How Do LLMs Understand Developer-Centric Code Completion? [60.84912551069379]
We present the Code-Development Benchmark (Codev-Bench), a fine-grained, real-world, repository-level, and developer-centric evaluation framework. Codev-Agent is an agent-based system that automates repository crawling, constructs execution environments, extracts dynamic calling chains from existing unit tests, and generates new test samples to avoid data leakage.
arXiv Detail & Related papers (2024-10-02T09:11:10Z)
CodeRAG-Bench: Can Retrieval Augment Code Generation? [78.37076502395699]
We conduct a systematic, large-scale analysis of code generation using retrieval-augmented generation. We first curate a comprehensive evaluation benchmark, CodeRAG-Bench, encompassing three categories of code generation tasks. We examine top-performing models on CodeRAG-Bench by providing contexts retrieved from one or multiple sources.
arXiv Detail & Related papers (2024-06-20T16:59:52Z)
GraphCoder: Enhancing Repository-Level Code Completion via Code Context Graph-based Retrieval and Language Model [30.625128161499195]
GraphCoder is a retrieval-augmented code completion framework. It uses general code knowledge and the repository-specific knowledge via a graph-based retrieval-generation process. It achieves higher exact match (EM) on average, with increases of +6.06 in code match and +6.23 in identifier match, while using less time and space.
arXiv Detail & Related papers (2024-06-11T06:55:32Z)
On The Importance of Reasoning for Context Retrieval in Repository-Level Code Editing [82.96523584351314]
We decouple the task of context retrieval from the other components of the repository-level code editing pipelines. We conclude that while the reasoning helps to improve the precision of the gathered context, it still lacks the ability to identify its sufficiency.
arXiv Detail & Related papers (2024-06-06T19:44:17Z)
Class-Level Code Generation from Natural Language Using Iterative, Tool-Enhanced Reasoning over Repository [4.767858874370881]
We introduce RepoClassBench, a benchmark designed to rigorously evaluate LLMs in generating class-level code within real-world repositories. RepoClassBench includes "Natural Language to Class generation" tasks across Java, Python & C# from a selection of repositories. We introduce Retrieve-Repotools-Reflect (RRR), a novel approach that equips LLMs with static analysis tools to iteratively navigate & reason about repository-level context.
arXiv Detail & Related papers (2024-04-22T03:52:54Z)
Enhancing LLM-Based Coding Tools through Native Integration of IDE-Derived Static Context [41.91246546266515]
We argue that Integrated Development Environments (IDEs) can provide direct, accurate and real-time cross-file information for repository-level code completion. We propose a framework that leveragesIDE native static contexts for cross-context construction and diagnosis results for self-refinement.
arXiv Detail & Related papers (2024-02-06T01:59:41Z)
StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback [58.20547418182074]
We introduce StepCoder, a novel framework for code generation, consisting of two main components. CCCS addresses the exploration challenge by breaking the long sequences code generation task into a Curriculum of Code Completion Subtasks. FGO only optimize the model by masking the unexecuted code segments to provide Fine-Grained Optimization. Our method improves the ability to explore the output space and outperforms state-of-the-art approaches in corresponding benchmarks.
arXiv Detail & Related papers (2024-02-02T13:14:31Z)
RepoCoder: Repository-Level Code Completion Through Iterative Retrieval and Generation [96.75695811963242]
RepoCoder is a framework to streamline the repository-level code completion process. It incorporates a similarity-based retriever and a pre-trained code language model. It consistently outperforms the vanilla retrieval-augmented code completion approach.
arXiv Detail & Related papers (2023-03-22T13:54:46Z)
ReACC: A Retrieval-Augmented Code Completion Framework [53.49707123661763]
We propose a retrieval-augmented code completion framework, leveraging both lexical copying and referring to code with similar semantics by retrieval. We evaluate our approach in the code completion task in Python and Java programming languages, achieving a state-of-the-art performance on CodeXGLUE benchmark.
arXiv Detail & Related papers (2022-03-15T08:25:08Z)

This list is automatically generated from the titles and abstracts of the papers in this site.