Improving Cross-Domain Low-Resource Text Generation through LLM
Post-Editing: A Programmer-Interpreter Approach
- URL: http://arxiv.org/abs/2402.04609v1
- Date: Wed, 7 Feb 2024 06:13:14 GMT
- Title: Improving Cross-Domain Low-Resource Text Generation through LLM
Post-Editing: A Programmer-Interpreter Approach
- Authors: Zhuang Li, Levon Haroutunian, Raj Tumuluri, Philip Cohen, Gholamreza
Haffari
- Abstract summary: Post-editing has proven effective in improving the quality of text generated by large language models (LLMs).
We propose a neural programmer-interpreter approach that preserves the domain generalization ability of LLMs when editing their output.
Experiments demonstrate that the programmer-interpreter significantly enhances GPT-3.5's performance in logical form-to-text conversion and low-resource machine translation.
- Score: 50.400999859808984
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Post-editing has proven effective in improving the quality of text generated
by large language models (LLMs) such as GPT-3.5 or GPT-4, particularly when
direct updating of their parameters to enhance text quality is infeasible or
expensive. However, relying solely on smaller language models for post-editing
can limit the LLMs' ability to generalize across domains. Moreover, the editing
strategies in these methods are not optimally designed for text-generation
tasks. To address these limitations, we propose a neural programmer-interpreter
approach that preserves the domain generalization ability of LLMs when editing
their output. The editing actions in this framework are specifically devised
for text generation. Extensive experiments demonstrate that the
programmer-interpreter significantly enhances GPT-3.5's performance in logical
form-to-text conversion and low-resource machine translation, surpassing other
state-of-the-art (SOTA) LLM post-editing methods in cross-domain settings.
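The abstract does not spell out the edit-action inventory or the programmer model, so the following is only a minimal sketch, assuming a token-level action set (insert / delete / replace) produced by a small "programmer" and applied by a deterministic interpreter over the LLM draft. The class names, the `toy_programmer` stand-in, and the example edits are illustrative assumptions, not the paper's actual interface.

```python
# Minimal sketch of a programmer-interpreter post-editing loop.
# Assumption: edits are token-level insert/delete/replace actions; the real
# framework's action set, programmer model, and training are not shown here.
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class EditAction:
    op: str          # "insert" | "delete" | "replace"
    index: int       # token position in the LLM draft
    text: str = ""   # payload for "insert" / "replace"


def interpret(draft: str, program: List[EditAction]) -> str:
    """Deterministically apply the programmer's edit program to the draft."""
    tokens = draft.split()
    out: List[str] = []
    for i, tok in enumerate(tokens):
        acts = [a for a in program if a.index == i]
        out.extend(a.text for a in acts if a.op == "insert")   # insert before i
        if any(a.op == "delete" for a in acts):
            continue                                           # drop this token
        repl = next((a.text for a in acts if a.op == "replace"), None)
        out.append(repl if repl is not None else tok)
    # insertions past the last draft token
    out.extend(a.text for a in program if a.op == "insert" and a.index >= len(tokens))
    return " ".join(out)


def post_edit(source: str, draft: str,
              programmer: Callable[[str, str], List[EditAction]]) -> str:
    """Programmer proposes edits conditioned on the input and the LLM draft;
    the draft is left untouched wherever no edit is predicted."""
    return interpret(draft, programmer(source, draft))


def toy_programmer(source: str, draft: str) -> List[EditAction]:
    # Hand-written stand-in for the learned programmer model.
    return [EditAction("replace", 2, "has"), EditAction("insert", 3, "a")]


if __name__ == "__main__":
    draft = "the patient have fever"   # imperfect LLM draft
    print(post_edit("(source logical form or sentence)", draft, toy_programmer))
    # -> "the patient has a fever"
```

Because the interpreter is deterministic, the LLM draft passes through unchanged wherever the programmer predicts no edits, which is one way to read the abstract's claim that this style of post-editing preserves the LLM's cross-domain generalization.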
Related papers
- Unveiling Large Language Models Generated Texts: A Multi-Level Fine-Grained Detection Framework [9.976099891796784]
Large language models (LLMs) have transformed human writing by enhancing grammar correction, content expansion, and stylistic refinement.
Existing detection methods, which mainly rely on single-feature analysis and binary classification, often fail to effectively identify LLM-generated text in academic contexts.
We propose a novel Multi-level Fine-grained Detection framework that detects LLM-generated text by integrating low-level structural, high-level semantic, and deep-level linguistic features.
arXiv Detail & Related papers (2024-10-18T07:25:00Z)
- Can AI writing be salvaged? Mitigating Idiosyncrasies and Improving Human-AI Alignment in the Writing Process through Edits [39.00434175773803]
We hired professional writers to edit paragraphs in several creative domains.
We curated the LAMP corpus: 1,057 LLM-generated paragraphs edited by professional writers according to our taxonomy.
Analysis of LAMP reveals that none of the LLMs used in our study consistently outperforms the others in terms of writing quality.
arXiv Detail & Related papers (2024-09-22T16:13:00Z)
- One Token Can Help! Learning Scalable and Pluggable Virtual Tokens for Retrieval-Augmented Large Language Models [67.49462724595445]
Retrieval-augmented generation (RAG) is a promising way to improve large language models (LLMs).
We propose a novel method that involves learning scalable and pluggable virtual tokens for RAG.
arXiv Detail & Related papers (2024-05-30T03:44:54Z)
- Building Accurate Translation-Tailored LLMs with Language Aware Instruction Tuning [57.323716555996114]
Off-target translation remains an unsolved problem, especially for low-resource languages.
Recent works have either designed advanced prompting strategies to highlight the functionality of translation instructions or exploited the in-context learning ability of LLMs.
In this work, we design a two-stage fine-tuning algorithm to improve the instruction-following ability (especially the translation direction) of LLMs.
arXiv Detail & Related papers (2024-03-21T13:47:40Z)
- LLM can Achieve Self-Regulation via Hyperparameter Aware Generation [88.69052513433603]
Large Language Models (LLMs) employ diverse decoding strategies to control the generated text.
Are LLMs conscious of the existence of these decoding strategies and capable of regulating themselves?
We propose a novel text generation paradigm termed Hyperparameter Aware Generation (HAG).
arXiv Detail & Related papers (2024-02-17T11:18:22Z)
- Harnessing the Plug-and-Play Controller by Prompting [12.705251690623495]
This paper introduces a novel method for flexible attribute control in text generation using pre-trained language models (PLMs).
The proposed approach aims to enhance the fluency of generated text by guiding the generation process with plug-and-play controllers (PPCs).
arXiv Detail & Related papers (2024-02-06T17:18:25Z)
- Contextual Refinement of Translations: Large Language Models for Sentence and Document-Level Post-Editing [12.843274390224853]
Large Language Models (LLMs) have demonstrated considerable success in various Natural Language Processing tasks.
We show that they have yet to attain state-of-the-art performance in Neural Machine Translation.
We propose adapting LLMs as Automatic Post-Editors (APE) rather than direct translators.
arXiv Detail & Related papers (2023-10-23T12:22:15Z)
- Reducing Sequence Length by Predicting Edit Operations with Large Language Models [50.66922361766939]
This paper proposes predicting edit spans for the source text for local sequence transduction tasks.
We apply instruction tuning for Large Language Models on the supervision data of edit spans.
Experiments show that the proposed method achieves comparable performance to the baseline in four tasks (see the edit-span sketch after this list).
arXiv Detail & Related papers (2023-05-19T17:51:05Z)
- Progressive Generation of Long Text with Pretrained Language Models [83.62523163717448]
Large-scale language models (LMs) pretrained on massive corpora of text, such as GPT-2, are powerful open-domain text generators.
It is still challenging for such models to generate coherent long passages of text, especially when the models are fine-tuned to the target domain on a small corpus.
We propose a simple but effective method of generating text in a progressive manner, inspired by generating images from low to high resolution.
arXiv Detail & Related papers (2020-06-28T21:23:05Z)
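Among the related papers above, the edit-span method in "Reducing Sequence Length by Predicting Edit Operations with Large Language Models" is closest in spirit to the editing actions of the main paper, so a brief sketch may help. The (start, end, replacement) triples, whitespace tokenization, and difflib-based span extraction below are assumptions for illustration; that paper's actual instruction and serialization format are not reproduced here.

```python
# Sketch of the edit-span idea: have the model emit only the spans that change
# instead of regenerating the full output. The span format is an assumption.
import difflib
from typing import List, Tuple

Span = Tuple[int, int, str]   # replace source tokens [start:end) with the string


def extract_edit_spans(source: str, target: str) -> List[Span]:
    """Derive supervision: which source spans must change to reach the target."""
    src, tgt = source.split(), target.split()
    spans: List[Span] = []
    for op, i1, i2, j1, j2 in difflib.SequenceMatcher(a=src, b=tgt).get_opcodes():
        if op != "equal":     # "replace", "delete", or "insert"
            spans.append((i1, i2, " ".join(tgt[j1:j2])))
    return spans


def apply_edit_spans(source: str, spans: List[Span]) -> str:
    """Apply spans right-to-left so earlier indices remain valid."""
    tokens = source.split()
    for start, end, replacement in sorted(spans, reverse=True):
        tokens[start:end] = replacement.split()
    return " ".join(tokens)


if __name__ == "__main__":
    src = "He go to school yesterday ."
    tgt = "He went to school yesterday ."
    spans = extract_edit_spans(src, tgt)     # [(1, 2, 'went')]
    print(apply_edit_spans(src, spans))      # -> "He went to school yesterday ."
```

The sequence the model must generate is then just the serialized spans, which for local transduction tasks such as grammatical error correction is much shorter than the full corrected text.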