Related papers: Bridging the Editing Gap in LLMs: FineEdit for Precise and Targeted Text Modifications

Bridging the Editing Gap in LLMs: FineEdit for Precise and Targeted Text Modifications

URL: http://arxiv.org/abs/2502.13358v1
Date: Wed, 19 Feb 2025 01:41:44 GMT
Title: Bridging the Editing Gap in LLMs: FineEdit for Precise and Targeted Text Modifications
Authors: Yiming Zeng, Wanhao Yu, Zexin Li, Tao Ren, Yu Ma, Jinghan Cao, Xiyan Chen, Tingting Yu,
Abstract summary: Large Language Models (LLMs) have transformed natural language processing, yet they still struggle with direct text editing tasks.<n>In this work, we introduce a dual approach to enhance LLM editing performance.<n>First, we present InstrEditBench, a high-quality benchmark dataset comprising over 20,000 structured editing tasks.<n>Second, we propose FineEdit, a specialized model trained on this curated benchmark.
Score: 9.795246551841586
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large Language Models (LLMs) have transformed natural language processing, yet they still struggle with direct text editing tasks that demand precise, context-aware modifications. While models like ChatGPT excel in text generation and analysis, their editing abilities often fall short, addressing only superficial issues rather than deeper structural or logical inconsistencies. In this work, we introduce a dual approach to enhance LLMs editing performance. First, we present InstrEditBench, a high-quality benchmark dataset comprising over 20,000 structured editing tasks spanning Wiki articles, LaTeX documents, code, and database Domain-specific Languages (DSL). InstrEditBench is generated using an innovative automated workflow that accurately identifies and evaluates targeted edits, ensuring that modifications adhere strictly to specified instructions without altering unrelated content. Second, we propose FineEdit, a specialized model trained on this curated benchmark. Experimental results demonstrate that FineEdit achieves significant improvements around {10\%} compared with Gemini on direct editing tasks, convincingly validating its effectiveness.

Related papers

RefEdit: A Benchmark and Method for Improving Instruction-based Image Editing Model on Referring Expressions [56.9437856499838]
We introduce RefEdit -- an instruction-based editing model trained on our scalable synthetic data generation pipeline.<n>Our RefEdit, trained on only 20,000 editing triplets, outperforms the Flux/SD3 model-based baselines trained on millions of data.
arXiv Detail & Related papers (2025-06-03T23:20:24Z)
InComeS: Integrating Compression and Selection Mechanisms into LLMs for Efficient Model Editing [77.47790551485721]
In-context learning is a promising editing method by comprehending edit information through context encoding.<n>This method is constrained by the limited context window of large language models.<n>We propose InComeS, a flexible framework that enhances LLMs' ability to process editing contexts.
arXiv Detail & Related papers (2025-05-28T09:20:18Z)
The Mirage of Model Editing: Revisiting Evaluation in the Wild [70.17413507444704]
We introduce QAEdit, a new benchmark aligned with widely used question answering (QA) datasets, and WILD, a task-agnostic evaluation framework.<n>Our single editing experiments show that current editing methods perform substantially worse than previously reported.
arXiv Detail & Related papers (2025-02-16T15:57:55Z)
K-Edit: Language Model Editing with Contextual Knowledge Awareness [71.73747181407323]
Knowledge-based model editing enables precise modifications to the weights of large language models.<n>We present K-Edit, an effective approach to generating contextually consistent knowledge edits.
arXiv Detail & Related papers (2025-02-15T01:35:13Z)
AnyEdit: Mastering Unified High-Quality Image Editing for Any Idea [88.79769371584491]
We present AnyEdit, a comprehensive multi-modal instruction editing dataset.<n>We ensure the diversity and quality of the AnyEdit collection through three aspects: initial data diversity, adaptive editing process, and automated selection of editing results.<n>Experiments on three benchmark datasets show that AnyEdit consistently boosts the performance of diffusion-based editing models.
arXiv Detail & Related papers (2024-11-24T07:02:56Z)
DocEdit-v2: Document Structure Editing Via Multimodal LLM Grounding [128.92659116774374]
We introduce DocEdit-v2, a novel framework that performs end-to-end document editing by leveraging Large Multimodal Models (LMMs) It consists of three novel components: (1) Doc2Command, which simultaneously localizes edit regions of interest (RoI) and disambiguates user edit requests into edit commands; (2) LLM-based Command Reformulation prompting to tailor edit commands originally intended for specialized software into edit instructions suitable for generalist LMMs; and (3) Moreover, DocEdit-v2 processes these outputs via Large Multimodal Models like GPT-4V and Gemini, to parse the document layout, execute edits on
arXiv Detail & Related papers (2024-10-21T19:59:04Z)
StruEdit: Structured Outputs Enable the Fast and Accurate Knowledge Editing for Large Language Models [41.45831411548188]
StruEdit consistently delivers the highest accuracy with lowest latency compared with other knowledge editing methods. Results show that StruEdit consistently delivers the highest accuracy with lowest latency compared with other knowledge editing methods.
arXiv Detail & Related papers (2024-09-16T09:48:56Z)
InstructEdit: Instruction-based Knowledge Editing for Large Language Models [39.2147118489123]
We develop an instruction-based editing technique, termed InstructEdit, which facilitates the editor's adaptation to various task performances simultaneously using simple instructions. Experiments involving holdout unseen task illustrate that InstructEdit consistently surpass previous strong baselines.
arXiv Detail & Related papers (2024-02-25T15:46:33Z)
Knowledge Editing on Black-box Large Language Models [37.17131278142237]
Knowledge editing aims to efficiently and precisely modify the behavior of large language models (LLMs) to update specific knowledge. Current research primarily focuses on white-box LLMs editing, overlooking an important scenario: black-box LLMs editing. We introduce KE on black-box LLMs and then propose a comprehensive evaluation framework to overcome the limitations of existing evaluations. Experiments and analysis on two benchmarks demonstrate that postEdit outperforms all baselines and achieves strong generalization.
arXiv Detail & Related papers (2024-02-13T17:59:34Z)
DUnE: Dataset for Unified Editing [3.7346004746366384]
We introduce DUnE-an editing benchmark where edits are natural language sentences. We show that retrieval-augmented language modeling can outperform specialized editing techniques.
arXiv Detail & Related papers (2023-11-27T18:56:14Z)
Beyond the Chat: Executable and Verifiable Text-Editing with LLMs [87.84199761550634]
Conversational interfaces powered by Large Language Models (LLMs) have recently become a popular way to obtain feedback during document editing. We present InkSync, an editing interface that suggests executable edits directly within the document being edited.
arXiv Detail & Related papers (2023-09-27T00:56:17Z)
XATU: A Fine-grained Instruction-based Benchmark for Explainable Text Updates [7.660511135287692]
This paper introduces XATU, the first benchmark specifically designed for fine-grained instruction-based explainable text editing. XATU considers finer-grained text editing tasks of varying difficulty, incorporating lexical, syntactic, semantic, and knowledge-intensive edit aspects. We demonstrate the effectiveness of instruction tuning and the impact of underlying architecture across various editing tasks.
arXiv Detail & Related papers (2023-09-20T04:58:59Z)
CoEdIT: Text Editing by Task-Specific Instruction Tuning [18.824571167583432]
CoEdIT is a state-of-the-art text editing system for writing assistance. It takes instructions from the user specifying the attributes of the desired text, and outputs the edited text. We present a large language model fine-tuned on a diverse collection of task-specific instructions for text editing.
arXiv Detail & Related papers (2023-05-17T00:05:24Z)
CoditT5: Pretraining for Source Code and Natural Language Editing [34.77621217370665]
CoditT5 is a large language model for software-related editing tasks that is pretrained on large amounts of source code and natural language comments. We fine-tune it on various downstream editing tasks, including comment updating, bug fixing, and automated code review.
arXiv Detail & Related papers (2022-08-10T16:59:40Z)
Learning Structural Edits via Incremental Tree Transformations [102.64394890816178]
We present a generic model for incremental editing of structured data (i.e., "structural edits") Our editor learns to iteratively generate tree edits (e.g., deleting or adding a subtree) and applies them to the partially edited data. We evaluate our proposed editor on two source code edit datasets, where results show that, with the proposed edit encoder, our editor significantly improves accuracy over previous approaches.
arXiv Detail & Related papers (2021-01-28T16:11:32Z)
Text Editing by Command [82.50904226312451]
A prevailing paradigm in neural text generation is one-shot generation, where text is produced in a single step. We address this limitation with an interactive text generation setting in which the user interacts with the system by issuing commands to edit existing text. We show that our Interactive Editor, a transformer-based model trained on this dataset, outperforms baselines and obtains positive results in both automatic and human evaluations.
arXiv Detail & Related papers (2020-10-24T08:00:30Z)

This list is automatically generated from the titles and abstracts of the papers in this site.