FineEdit: Unlock Instruction-Based Text Editing for LLMs
- URL: http://arxiv.org/abs/2502.13358v2
- Date: Tue, 20 May 2025 18:27:30 GMT
- Title: FineEdit: Unlock Instruction-Based Text Editing for LLMs
- Authors: Yiming Zeng, Wanhao Yu, Zexin Li, Tao Ren, Yu Ma, Jinghan Cao, Xiyan Chen, Tingting Yu
- Abstract summary: FineEdit is a specialized editing model explicitly trained for accurate, context-aware text modifications. FineEdit outperforms state-of-the-art models on single-turn edits, improving on Llama-3.2-3B by up to 30% and exceeding Mistral-7B-OpenOrca by over 40% on direct editing tasks.
- Score: 9.795246551841586
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large Language Models (LLMs) have significantly advanced natural language processing, demonstrating strong capabilities in tasks such as text generation, summarization, and reasoning. Recently, their potential for automating precise text editing tasks across specialized domains, such as programming code, LaTeX, and structured database languages, has gained attention. However, current state-of-the-art LLMs still struggle with executing precise, instruction-driven edits, particularly when structural accuracy and strict adherence to domain conventions are required. To address these challenges, we introduce InstrEditBench, an automated benchmark dataset comprising over 30,000 structured editing tasks spanning diverse domains, including Wikipedia articles, LaTeX documents, source code, and database languages. Using this benchmark, we develop FineEdit, a specialized editing model explicitly trained for accurate, context-aware text modifications. Experimental evaluations demonstrate that FineEdit outperforms state-of-the-art models, achieving improvements of approximately 10% over Gemini models on single-turn edits, up to 30% over Llama-3.2-3B, and exceeding Mistral-7B-OpenOrca performance by over 40% on direct editing tasks. FineEdit also effectively generalizes to realistic multi-turn editing scenarios, highlighting its practical applicability.
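To make the notion of an instruction-driven edit concrete, here is a minimal sketch in Python. It is not taken from the paper: the `EditTask` record layout, its field names, and the strict line-by-line comparison are all assumptions for illustration, and InstrEditBench may define its tasks and metrics differently.

```python
import difflib
from dataclasses import dataclass


@dataclass
class EditTask:
    # Hypothetical record layout; these field names are assumptions,
    # not the InstrEditBench schema.
    original: str     # text before the edit
    instruction: str  # natural-language edit request
    expected: str     # reference text after the edit


def changed_lines(before: str, after: str) -> list[str]:
    """Return only the removed ('-') and added ('+') lines between two texts."""
    diff = list(difflib.unified_diff(
        before.splitlines(), after.splitlines(), lineterm="", n=0))
    # The first two entries are the '---' / '+++' file headers; skip them,
    # then drop the '@@' hunk headers.
    return [line for line in diff[2:] if line.startswith(("+", "-"))]


def edit_is_exact(task: EditTask, model_output: str) -> bool:
    """Strict check (an assumed metric): output must match the reference
    line by line, ignoring trailing whitespace."""
    def normalize(text: str) -> list[str]:
        return [line.rstrip() for line in text.strip().splitlines()]
    return normalize(model_output) == normalize(task.expected)


if __name__ == "__main__":
    task = EditTask(
        original="\\section{Intro}\nHello wrold.\n",
        instruction="Fix the typo in the second line.",
        expected="\\section{Intro}\nHello world.\n",
    )
    print(changed_lines(task.original, task.expected))
    print(edit_is_exact(task, "\\section{Intro}\nHello world.\n"))
```

A strict comparison like this penalizes any unrequested change outside the targeted line, which is the kind of structural precision the abstract says current models struggle with.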
Related papers
- RefEdit: A Benchmark and Method for Improving Instruction-based Image Editing Model on Referring Expressions [56.9437856499838]
We introduce RefEdit -- an instruction-based editing model trained on our scalable synthetic data generation pipeline.
Our RefEdit, trained on only 20,000 editing triplets, outperforms the Flux/SD3 model-based baselines trained on millions of examples.
arXiv Detail & Related papers (2025-06-03T23:20:24Z)
- InComeS: Integrating Compression and Selection Mechanisms into LLMs for Efficient Model Editing [77.47790551485721]
In-context learning is a promising editing method that comprehends edit information through context encoding.
This method is constrained by the limited context window of large language models.
We propose InComeS, a flexible framework that enhances LLMs' ability to process editing contexts.
arXiv Detail & Related papers (2025-05-28T09:20:18Z)
- The Mirage of Model Editing: Revisiting Evaluation in the Wild [70.17413507444704]
We introduce QAEdit, a new benchmark aligned with widely used question answering (QA) datasets, and WILD, a task-agnostic evaluation framework.
Our single editing experiments show that current editing methods perform substantially worse than previously reported.
arXiv Detail & Related papers (2025-02-16T15:57:55Z)
- K-Edit: Language Model Editing with Contextual Knowledge Awareness [71.73747181407323]
Knowledge-based model editing enables precise modifications to the weights of large language models.
We present K-Edit, an effective approach to generating contextually consistent knowledge edits.
arXiv Detail & Related papers (2025-02-15T01:35:13Z)
- AnyEdit: Mastering Unified High-Quality Image Editing for Any Idea [88.79769371584491]
We present AnyEdit, a comprehensive multi-modal instruction editing dataset.
We ensure the diversity and quality of the AnyEdit collection through three aspects: initial data diversity, adaptive editing process, and automated selection of editing results.
Experiments on three benchmark datasets show that AnyEdit consistently boosts the performance of diffusion-based editing models.
arXiv Detail & Related papers (2024-11-24T07:02:56Z)
- DocEdit-v2: Document Structure Editing Via Multimodal LLM Grounding [128.92659116774374]
We introduce DocEdit-v2, a novel framework that performs end-to-end document editing by leveraging Large Multimodal Models (LMMs).
It consists of three novel components: (1) Doc2Command, which simultaneously localizes edit regions of interest (RoI) and disambiguates user edit requests into edit commands; (2) LLM-based Command Reformulation prompting to tailor edit commands originally intended for specialized software into edit instructions suitable for generalist LMMs; and (3) Moreover, DocEdit-v2 processes these outputs via Large Multimodal Models like GPT-4V and Gemini, to parse the document layout, execute edits on
arXiv Detail & Related papers (2024-10-21T19:59:04Z)
- StruEdit: Structured Outputs Enable the Fast and Accurate Knowledge Editing for Large Language Models [41.45831411548188]
StruEdit consistently delivers the highest accuracy with lowest latency compared with other knowledge editing methods.
arXiv Detail & Related papers (2024-09-16T09:48:56Z)
- InstructEdit: Instruction-based Knowledge Editing for Large Language Models [39.2147118489123]
We develop an instruction-based editing technique, termed InstructEdit, which facilitates the editor's adaptation to various task performances simultaneously using simple instructions.
Experiments involving a held-out unseen task illustrate that InstructEdit consistently surpasses previous strong baselines.
arXiv Detail & Related papers (2024-02-25T15:46:33Z)
- Knowledge Editing on Black-box Large Language Models [37.17131278142237]
Knowledge editing aims to efficiently and precisely modify the behavior of large language models (LLMs) to update specific knowledge.
Current research primarily focuses on white-box LLM editing, overlooking an important scenario: black-box LLM editing.
We introduce KE on black-box LLMs and then propose a comprehensive evaluation framework to overcome the limitations of existing evaluations.
Experiments and analysis on two benchmarks demonstrate that postEdit outperforms all baselines and achieves strong generalization.
arXiv Detail & Related papers (2024-02-13T17:59:34Z)
- DUnE: Dataset for Unified Editing [3.7346004746366384]
We introduce DUnE, an editing benchmark where edits are natural language sentences.
We show that retrieval-augmented language modeling can outperform specialized editing techniques.
arXiv Detail & Related papers (2023-11-27T18:56:14Z)
- Beyond the Chat: Executable and Verifiable Text-Editing with LLMs [87.84199761550634]
Conversational interfaces powered by Large Language Models (LLMs) have recently become a popular way to obtain feedback during document editing.
We present InkSync, an editing interface that suggests executable edits directly within the document being edited.
arXiv Detail & Related papers (2023-09-27T00:56:17Z)
- XATU: A Fine-grained Instruction-based Benchmark for Explainable Text Updates [7.660511135287692]
This paper introduces XATU, the first benchmark specifically designed for fine-grained instruction-based explainable text editing.
XATU considers finer-grained text editing tasks of varying difficulty, incorporating lexical, syntactic, semantic, and knowledge-intensive edit aspects.
We demonstrate the effectiveness of instruction tuning and the impact of underlying architecture across various editing tasks.
arXiv Detail & Related papers (2023-09-20T04:58:59Z)
- CoEdIT: Text Editing by Task-Specific Instruction Tuning [18.824571167583432]
CoEdIT is a state-of-the-art text editing system for writing assistance.
It takes instructions from the user specifying the attributes of the desired text, and outputs the edited text.
We present a large language model fine-tuned on a diverse collection of task-specific instructions for text editing.
arXiv Detail & Related papers (2023-05-17T00:05:24Z)
- CoditT5: Pretraining for Source Code and Natural Language Editing [34.77621217370665]
CoditT5 is a large language model for software-related editing tasks that is pretrained on large amounts of source code and natural language comments.
We fine-tune it on various downstream editing tasks, including comment updating, bug fixing, and automated code review.
arXiv Detail & Related papers (2022-08-10T16:59:40Z)
- Learning Structural Edits via Incremental Tree Transformations [102.64394890816178]
We present a generic model for incremental editing of structured data (i.e., "structural edits").
Our editor learns to iteratively generate tree edits (e.g., deleting or adding a subtree) and applies them to the partially edited data.
We evaluate our proposed editor on two source code edit datasets, where results show that, with the proposed edit encoder, our editor significantly improves accuracy over previous approaches.
arXiv Detail & Related papers (2021-01-28T16:11:32Z)
- Text Editing by Command [82.50904226312451]
A prevailing paradigm in neural text generation is one-shot generation, where text is produced in a single step.
We address this limitation with an interactive text generation setting in which the user interacts with the system by issuing commands to edit existing text.
We show that our Interactive Editor, a transformer-based model trained on this dataset, outperforms baselines and obtains positive results in both automatic and human evaluations.
arXiv Detail & Related papers (2020-10-24T08:00:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.