Beyond the Chat: Executable and Verifiable Text-Editing with LLMs
- URL: http://arxiv.org/abs/2309.15337v1
- Date: Wed, 27 Sep 2023 00:56:17 GMT
- Title: Beyond the Chat: Executable and Verifiable Text-Editing with LLMs
- Authors: Philippe Laban, Jesse Vig, Marti A. Hearst, Caiming Xiong, Chien-Sheng Wu
- Abstract summary: Conversational interfaces powered by Large Language Models (LLMs) have recently become a popular way to obtain feedback during document editing.
We present InkSync, an editing interface that suggests executable edits directly within the document being edited.
- Score: 87.84199761550634
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Conversational interfaces powered by Large Language Models (LLMs) have
recently become a popular way to obtain feedback during document editing.
However, standard chat-based conversational interfaces do not support
transparency and verifiability of the editing changes that they suggest. To
give the author more agency when editing with an LLM, we present InkSync, an
editing interface that suggests executable edits directly within the document
being edited. Because LLMs are known to introduce factual errors, InkSync also
supports a 3-stage approach to mitigate this risk: Warn authors when a
suggested edit introduces new information, help authors Verify the new
information's accuracy through external search, and allow an auditor to perform
an a-posteriori verification by Auditing the document via a trace of all
auto-generated content. Two usability studies confirm the effectiveness of
InkSync's components when compared to standard LLM-based chat interfaces,
leading to more accurate, more efficient editing, and improved user experience.
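The abstract's Warn/Verify/Audit pipeline can be sketched as a minimal data model: an executable edit carries a flag for newly introduced information, and an audit trail records every applied edit so an auditor can later review unverified additions. All names and fields below are illustrative assumptions, not InkSync's actual API.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class ExecutableEdit:
    """A concrete, directly applicable edit suggested by an LLM."""
    start: int                          # character offset where the edit begins
    end: int                            # character offset where the edit ends
    replacement: str                    # text to substitute into the document
    introduces_new_info: bool = False   # triggers a Warn to the author
    verified: Optional[bool] = None     # set after the external-search Verify step

@dataclass
class AuditTrail:
    """Trace of auto-generated content for a-posteriori auditing."""
    applied_edits: List[ExecutableEdit] = field(default_factory=list)

    def flagged_for_audit(self) -> List[ExecutableEdit]:
        # An auditor reviews edits that added new information
        # but were never verified by the author.
        return [e for e in self.applied_edits
                if e.introduces_new_info and e.verified is not True]

def apply_edit(document: str, edit: ExecutableEdit, trail: AuditTrail) -> str:
    """Execute a suggested edit and record it in the audit trail."""
    trail.applied_edits.append(edit)
    return document[:edit.start] + edit.replacement + document[edit.end:]
```

The point of the sketch is the separation of concerns: the Warn flag travels with the edit itself, while verification status and the full trace live outside the document text, so auditing needs no access to the original chat transcript.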
Related papers
- FactCheck Editor: Multilingual Text Editor with End-to-End fact-checking [1.985242455423935]
'FactCheck Editor' is an advanced text editor designed to automate fact-checking and correct factual inaccuracies.
It supports over 90 languages and utilizes transformer models to assist humans in the labor-intensive process of fact verification.
arXiv Detail & Related papers (2024-04-30T11:55:20Z)
- Editing Conceptual Knowledge for Large Language Models [67.8410749469755]
This paper pioneers the investigation of editing conceptual knowledge for Large Language Models (LLMs).
We construct a novel benchmark dataset, ConceptEdit, and establish a suite of new metrics for evaluation.
Experimental results reveal that, although existing editing methods can efficiently modify concept-level definitions to some extent, they also have the potential to distort the related instantial knowledge.
arXiv Detail & Related papers (2024-03-10T16:57:10Z)
- GenAudit: Fixing Factual Errors in Language Model Outputs with Evidence [64.95492752484171]
We present GenAudit -- a tool intended to assist fact-checking LLM responses for document-grounded tasks.
We train models to execute these tasks, and design an interactive interface to present suggested edits and evidence to users.
To ensure that most errors are flagged by the system, we propose a method that can increase the error recall while minimizing impact on precision.
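The recall/precision trade-off mentioned above can be illustrated with a toy threshold rule: if an error detector scores each output sentence, lowering the decision threshold flags more true errors (higher recall) at the cost of more false alarms (lower precision). This is a hedged sketch of the general idea only, not GenAudit's actual method; the scores and threshold values are fabricated.

```python
def flag_errors(scores, threshold):
    """Return indices of sentences whose error score meets the threshold."""
    return [i for i, score in enumerate(scores) if score >= threshold]

# Hypothetical per-sentence error scores from some detector.
scores = [0.9, 0.2, 0.55, 0.1]

strict = flag_errors(scores, threshold=0.8)    # fewer flags, higher precision
lenient = flag_errors(scores, threshold=0.5)   # more flags, higher recall
```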
arXiv Detail & Related papers (2024-02-19T21:45:55Z)
- Learning to Edit: Aligning LLMs with Knowledge Editing [101.96620267293731]
We propose a Learning to Edit (LTE) framework that focuses on teaching large language models to apply updated knowledge to input questions.
LTE features a two-phase process; its first phase, the Alignment Phase, fine-tunes LLMs on a meticulously curated parallel dataset to make reliable, in-scope edits.
We demonstrate LTE's superiority in knowledge editing performance, robustness in both batch and sequential editing, minimal interference on general tasks, and rapid editing speeds.
arXiv Detail & Related papers (2024-02-19T07:45:17Z)
- Knowledge Editing on Black-box Large Language Models [37.17131278142237]
Knowledge editing aims to efficiently and precisely modify the behavior of large language models (LLMs) to update specific knowledge.
Current research primarily focuses on editing white-box LLMs, overlooking an important scenario: editing black-box LLMs.
We introduce knowledge editing (KE) for black-box LLMs and propose a comprehensive evaluation framework that overcomes the limitations of existing evaluations.
Experiments and analysis on two benchmarks demonstrate that the proposed postEdit method outperforms all baselines and achieves strong generalization.
arXiv Detail & Related papers (2024-02-13T17:59:34Z)
- SWEA: Updating Factual Knowledge in Large Language Models via Subject Word Embedding Altering [17.20346072074533]
Recent model editing is a promising technique for efficiently updating a small amount of knowledge in large language models (LLMs).
We propose a detachable and expandable Subject Word Embedding Altering (SWEA) framework, which finds the editing embeddings through token-level matching.
We demonstrate the overall state-of-the-art (SOTA) performance of SWEA⊕OS on the CounterFact and zsRE datasets.
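The subject-word-embedding-altering idea can be sketched as follows: a stored offset vector is added to the input embedding of every token that matches an edited subject, leaving all other tokens untouched. The matching rule, the offset values, and all names here are hypothetical stand-ins, not SWEA's actual components.

```python
# Fabricated editing offset for an edited subject word (illustrative only).
edit_table = {"Paris": [0.5, -0.2, 0.1]}

def alter_embeddings(tokens, embeddings, table):
    """Add the editing offset to the embedding of each matched subject token."""
    out = []
    for tok, vec in zip(tokens, embeddings):
        offset = table.get(tok)
        if offset is None:
            out.append(list(vec))  # unedited token: embedding passes through
        else:
            out.append([v + o for v, o in zip(vec, offset)])
    return out
```

Because the offsets live in a detachable lookup table rather than in the model weights, an edit can be added, expanded, or removed without retraining, which matches the "detachable and expandable" framing in the abstract.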
arXiv Detail & Related papers (2024-01-31T13:08:45Z)
- Guiding LLM to Fool Itself: Automatically Manipulating Machine Reading Comprehension Shortcut Triggers [76.77077447576679]
Shortcuts, mechanisms triggered by features spuriously correlated with the true label, have emerged as a potential threat to Machine Reading Comprehension (MRC) systems.
We introduce a framework that guides an editor to add potential shortcut triggers to samples.
Using GPT-4 as the editor, we find it can successfully insert shortcut triggers into samples that fool LLMs.
arXiv Detail & Related papers (2023-10-24T12:37:06Z)
- XATU: A Fine-grained Instruction-based Benchmark for Explainable Text Updates [7.660511135287692]
This paper introduces XATU, the first benchmark specifically designed for fine-grained instruction-based explainable text editing.
XATU considers finer-grained text editing tasks of varying difficulty, incorporating lexical, syntactic, semantic, and knowledge-intensive edit aspects.
We demonstrate the effectiveness of instruction tuning and the impact of underlying architecture across various editing tasks.
arXiv Detail & Related papers (2023-09-20T04:58:59Z)
- Eva-KELLM: A New Benchmark for Evaluating Knowledge Editing of LLMs [54.22416829200613]
Eva-KELLM is a new benchmark for evaluating knowledge editing of large language models.
Experimental results indicate that the current methods for knowledge editing using raw documents are not effective in yielding satisfactory results.
arXiv Detail & Related papers (2023-08-19T09:17:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.