Improving Cross-Domain Low-Resource Text Generation through LLM
Post-Editing: A Programmer-Interpreter Approach
- URL: http://arxiv.org/abs/2402.04609v1
- Date: Wed, 7 Feb 2024 06:13:14 GMT
- Title: Improving Cross-Domain Low-Resource Text Generation through LLM
Post-Editing: A Programmer-Interpreter Approach
- Authors: Zhuang Li, Levon Haroutunian, Raj Tumuluri, Philip Cohen, Gholamreza
Haffari
- Abstract summary: Post-editing has proven effective in improving the quality of text generated by large language models (LLMs)
We propose a neural programmer-interpreter approach that preserves the domain generalization ability of LLMs when editing their output.
Experiments demonstrate that the programmer-interpreter significantly enhances GPT-3.5's performance in logical form-to-text conversion and low-resource machine translation.
- Score: 50.400999859808984
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Post-editing has proven effective in improving the quality of text generated
by large language models (LLMs) such as GPT-3.5 or GPT-4, particularly when
direct updating of their parameters to enhance text quality is infeasible or
expensive. However, relying solely on smaller language models for post-editing
can limit the LLMs' ability to generalize across domains. Moreover, the editing
strategies in these methods are not optimally designed for text-generation
tasks. To address these limitations, we propose a neural programmer-interpreter
approach that preserves the domain generalization ability of LLMs when editing
their output. The editing actions in this framework are specifically devised
for text generation. Extensive experiments demonstrate that the
programmer-interpreter significantly enhances GPT-3.5's performance in logical
form-to-text conversion and low-resource machine translation, surpassing other
state-of-the-art (SOTA) LLM post-editing methods in cross-domain settings.
Related papers
- One Token Can Help! Learning Scalable and Pluggable Virtual Tokens for Retrieval-Augmented Large Language Models [67.49462724595445]
Retrieval-augmented generation (RAG) is a promising way to improve large language models (LLMs)
We propose a novel method that involves learning scalable and pluggable virtual tokens for RAG.
arXiv Detail & Related papers (2024-05-30T03:44:54Z) - Building Accurate Translation-Tailored LLMs with Language Aware Instruction Tuning [57.323716555996114]
Off-target translation remains an unsolved problem, especially for low-resource languages.
Recent works have either designed advanced prompting strategies to highlight the functionality of translation instructions or exploited the in-context learning ability of LLMs.
In this work, we design a two-stage fine-tuning algorithm to improve the instruction-following ability (especially the translation direction) of LLMs.
arXiv Detail & Related papers (2024-03-21T13:47:40Z) - LLM can Achieve Self-Regulation via Hyperparameter Aware Generation [88.69052513433603]
Large Language Models (LLMs) employ diverse decoding strategies to control the generated text.
Are LLMs conscious of the existence of these decoding strategies and capable of regulating themselves?
We propose a novel text generation paradigm termed Hyperparameter Aware Generation (HAG)
arXiv Detail & Related papers (2024-02-17T11:18:22Z) - Large Language Models for the Automated Analysis of Optimization
Algorithms [0.9668407688201361]
We aim to demonstrate the potential of Large Language Models (LLMs) within the realm of optimization algorithms by integrating them into STNWeb.
This is a web-based tool for the generation of Search Trajectory Networks (STNs), which are visualizations of optimization algorithm behavior.
arXiv Detail & Related papers (2024-02-13T14:05:02Z) - Harnessing the Plug-and-Play Controller by Prompting [12.705251690623495]
This paper introduces a novel method for flexible attribute control in text generation using pre-trained language models (PLMs)
The proposed approach aims to enhance the fluency of generated text by guiding the generation process with PPCs.
arXiv Detail & Related papers (2024-02-06T17:18:25Z) - Contextual Refinement of Translations: Large Language Models for Sentence and Document-Level Post-Editing [12.843274390224853]
Large Language Models (LLM's) have demonstrated considerable success in various Natural Language Processing tasks.
We show that they have yet to attain state-of-the-art performance in Neural Machine Translation.
We propose adapting LLM's as Automatic Post-Editors (APE) rather than direct translators.
arXiv Detail & Related papers (2023-10-23T12:22:15Z) - CLIP-Guided StyleGAN Inversion for Text-Driven Real Image Editing [22.40686064568406]
We present CLIPInverter, a new text-driven image editing approach that is able to efficiently and reliably perform multi-attribute changes.
Our method outperforms competing approaches in terms of manipulation accuracy and photo-realism on various domains including human faces, cats, and birds.
arXiv Detail & Related papers (2023-07-17T11:29:48Z) - Reducing Sequence Length by Predicting Edit Operations with Large
Language Models [50.66922361766939]
This paper proposes predicting edit spans for the source text for local sequence transduction tasks.
We apply instruction tuning for Large Language Models on the supervision data of edit spans.
Experiments show that the proposed method achieves comparable performance to the baseline in four tasks.
arXiv Detail & Related papers (2023-05-19T17:51:05Z) - Progressive Generation of Long Text with Pretrained Language Models [83.62523163717448]
Large-scale language models (LMs) pretrained on massive corpora of text, such as GPT-2, are powerful open-domain text generators.
It is still challenging for such models to generate coherent long passages of text, especially when the models are fine-tuned to the target domain on a small corpus.
We propose a simple but effective method of generating text in a progressive manner, inspired by generating images from low to high resolution.
arXiv Detail & Related papers (2020-06-28T21:23:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.