EditLord: Learning Code Transformation Rules for Code Editing
- URL: http://arxiv.org/abs/2504.15284v2
- Date: Wed, 23 Apr 2025 18:37:18 GMT
- Title: EditLord: Learning Code Transformation Rules for Code Editing
- Authors: Weichen Li, Albert Jan, Baishakhi Ray, Chengzhi Mao, Junfeng Yang, Kexin Pei
- Abstract summary: Existing approaches often formulate code editing as an implicit end-to-end task, omitting the fact that code-editing procedures inherently consist of discrete and explicit steps. We introduce EditLord, a code editing framework that makes the code transformation steps explicit. Our key insight is to employ a language model (LM) as an inductive learner to extract code editing rules from the training code pairs as concise meta-rule sets.
- Score: 26.41680850940224
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Code editing is a foundational task in software development, where its effectiveness depends on whether it introduces desired code property changes without changing the original code's intended functionality. Existing approaches often formulate code editing as an implicit end-to-end task, omitting the fact that code-editing procedures inherently consist of discrete and explicit steps. Thus, they suffer from suboptimal performance and lack of robustness and generalization. We introduce EditLord, a code editing framework that makes the code transformation steps explicit. Our key insight is to employ a language model (LM) as an inductive learner to extract code editing rules from the training code pairs as concise meta-rule sets. Such rule sets will be manifested for each training sample to augment them for finetuning or assist in prompting- and iterative-based code editing. EditLord outperforms the state-of-the-art by an average of 22.7% in editing performance and 58.1% in robustness while achieving 20.2% higher functional correctness across critical software engineering and security applications, LM models, and editing modes.
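The rule-extraction idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `LM` callable interface, the prompt wording, the deduplication step used to form the "meta-rule set", and the `toy_lm` stand-in are all assumptions made for illustration. The sketch only shows the general shape: ask an LM for rules per code pair, merge them into a shared rule set, then attach the applicable rules to each training sample.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class CodePair:
    before: str  # code prior to the edit
    after: str   # code after the edit


# Hypothetical LM interface: a function from prompt text to response text.
LM = Callable[[str], str]


def propose_rules(lm: LM, pair: CodePair) -> List[str]:
    """Ask the LM which editing rules explain one before/after pair."""
    prompt = (
        "List the transformation rules, one per line.\n"
        f"BEFORE:\n{pair.before}\nAFTER:\n{pair.after}\n"
    )
    lines = lm(prompt).splitlines()
    # Drop bullet markers and blank lines from the response.
    return [ln.lstrip("-* ").strip() for ln in lines if ln.strip()]


def build_meta_rules(lm: LM, pairs: Iterable[CodePair]) -> List[str]:
    """Merge per-pair rules into one set (assumed: case-insensitive dedup)."""
    seen: Dict[str, str] = {}
    for pair in pairs:
        for rule in propose_rules(lm, pair):
            seen.setdefault(rule.lower(), rule)
    return list(seen.values())


def augment(lm: LM, pair: CodePair, meta_rules: List[str]) -> Dict[str, object]:
    """Attach the meta-rules that apply to this pair to the training sample."""
    known = {r.lower() for r in meta_rules}
    rules = [r for r in propose_rules(lm, pair) if r.lower() in known]
    return {"input": pair.before, "rules": rules, "output": pair.after}


def toy_lm(prompt: str) -> str:
    """Stand-in for a real LM so the sketch runs offline."""
    if "strcpy" in prompt:
        return "- Replace unbounded copies with bounded copies\n- Check buffer sizes"
    return "- Check buffer sizes"


pairs = [
    CodePair("strcpy(dst, src);", "strncpy(dst, src, n);"),
    CodePair("buf[i] = x;", "if (i < n) buf[i] = x;"),
]
meta_rules = build_meta_rules(toy_lm, pairs)
samples = [augment(toy_lm, p, meta_rules) for p in pairs]
```

In a real setting `toy_lm` would be replaced by a call to an actual model, and the augmented samples (input, rules, output) would serve as finetuning data or as prompt context.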
Related papers
- Bridging the Editing Gap in LLMs: FineEdit for Precise and Targeted Text Modifications [9.795246551841586]
Large Language Models (LLMs) have transformed natural language processing, yet they still struggle with direct text editing tasks.
In this work, we introduce a dual approach to enhance LLM editing performance.
First, we present InstrEditBench, a high-quality benchmark dataset comprising over 20,000 structured editing tasks.
Second, we propose FineEdit, a specialized model trained on this curated benchmark.
arXiv Detail & Related papers (2025-02-19T01:41:44Z) - AnyEdit: Edit Any Knowledge Encoded in Language Models [69.30638272162267]
We propose AnyEdit, a new autoregressive editing paradigm for large language models (LLMs).
It decomposes long-form knowledge into sequential chunks and iteratively edits the key token in each chunk, ensuring consistent and accurate outputs.
It outperforms strong baselines by 21.5% on benchmarks including UnKEBench, AKEW, and our new EditEverything dataset for long-form diverse-formatted knowledge.
arXiv Detail & Related papers (2025-02-08T16:18:37Z) - CodeEditorBench: Evaluating Code Editing Capability of Large Language Models [49.387195629660994]
Large Language Models (LLMs) for code are rapidly evolving, with code editing emerging as a critical capability. We introduce CodeEditorBench, an evaluation framework designed to rigorously assess the performance of LLMs in code editing tasks. We curate diverse coding challenges and scenarios from five sources, covering various programming languages, complexity levels, and editing tasks.
arXiv Detail & Related papers (2024-04-04T15:49:49Z) - InstructCoder: Instruction Tuning Large Language Models for Code Editing [26.160498475809266]
We explore the use of Large Language Models (LLMs) to edit code based on user instructions.
InstructCoder is the first instruction-tuning dataset designed to adapt LLMs for general-purpose code editing.
Our findings reveal that open-source LLMs fine-tuned on InstructCoder can significantly enhance the accuracy of code edits.
arXiv Detail & Related papers (2023-10-31T10:15:35Z) - Coeditor: Leveraging Contextual Changes for Multi-round Code Auto-editing [57.776971051512234]
In this work, we explore a multi-round code auto-editing setting, aiming to predict edits to a code region based on recent changes within the same.
Our model, Coeditor, is a fine-tuned language model specifically designed for code editing tasks.
In a simplified single-round, single-edit task, Coeditor significantly outperforms GPT-3.5 and SOTA open-source code completion models.
arXiv Detail & Related papers (2023-05-29T19:57:36Z) - GrACE: Generation using Associated Code Edits [23.643567386291988]
We endow pre-trained large language models (LLMs) of code with the knowledge of prior, relevant edits.
The generative capability of the LLMs helps address the diversity in code changes and conditioning code generation on prior edits.
We evaluate two well-known LLMs, Codex and CodeT5, in zero-shot and fine-tuning settings respectively.
arXiv Detail & Related papers (2023-05-23T14:55:44Z) - CodeT5+: Open Code Large Language Models for Code Understanding and
Generation [72.1638273937025]
Large language models (LLMs) pretrained on vast source code have achieved prominent progress in code intelligence.
CodeT5+ is a family of encoder-decoder LLMs for code in which component modules can be flexibly combined to suit a wide range of downstream code tasks.
We extensively evaluate CodeT5+ on over 20 code-related benchmarks in different settings, including zero-shot, finetuning, and instruction-tuning.
arXiv Detail & Related papers (2023-05-13T14:23:07Z) - CodeEditor: Learning to Edit Source Code with Pre-trained Models [47.736781998792]
This paper presents an effective pre-trained code editing model named CodeEditor.
We collect lots of real-world code snippets as the ground truth and use a powerful generator to rewrite them into mutated versions.
We conduct experiments on four code editing datasets and evaluate the pre-trained CodeEditor in three settings.
arXiv Detail & Related papers (2022-10-31T03:26:33Z) - CoditT5: Pretraining for Source Code and Natural Language Editing [34.77621217370665]
CoditT5 is a large language model for software-related editing tasks that is pretrained on large amounts of source code and natural language comments.
We fine-tune it on various downstream editing tasks, including comment updating, bug fixing, and automated code review.
arXiv Detail & Related papers (2022-08-10T16:59:40Z) - Unsupervised Learning of General-Purpose Embeddings for Code Changes [6.652641137999891]
We propose an approach for obtaining embeddings of code changes during pre-training.
We evaluate them on two different downstream tasks - applying changes to code and commit message generation.
Our model outperforms the model that uses full edit sequences by 5.9 percentage points in accuracy.
arXiv Detail & Related papers (2021-06-03T19:08:53Z) - Learning Structural Edits via Incremental Tree Transformations [102.64394890816178]
We present a generic model for incremental editing of structured data (i.e., "structural edits").
Our editor learns to iteratively generate tree edits (e.g., deleting or adding a subtree) and applies them to the partially edited data.
We evaluate our proposed editor on two source code edit datasets, where results show that, with the proposed edit encoder, our editor significantly improves accuracy over previous approaches.
arXiv Detail & Related papers (2021-01-28T16:11:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.