Improving Cross-Domain Low-Resource Text Generation through LLM
Post-Editing: A Programmer-Interpreter Approach
- URL: http://arxiv.org/abs/2402.04609v1
- Date: Wed, 7 Feb 2024 06:13:14 GMT
- Title: Improving Cross-Domain Low-Resource Text Generation through LLM
Post-Editing: A Programmer-Interpreter Approach
- Authors: Zhuang Li, Levon Haroutunian, Raj Tumuluri, Philip Cohen, Gholamreza
Haffari
- Abstract summary: Post-editing has proven effective in improving the quality of text generated by large language models (LLMs).
We propose a neural programmer-interpreter approach that preserves the domain generalization ability of LLMs when editing their output.
Experiments demonstrate that the programmer-interpreter significantly enhances GPT-3.5's performance in logical form-to-text conversion and low-resource machine translation.
- Score: 50.400999859808984
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Post-editing has proven effective in improving the quality of text generated
by large language models (LLMs) such as GPT-3.5 or GPT-4, particularly when
direct updating of their parameters to enhance text quality is infeasible or
expensive. However, relying solely on smaller language models for post-editing
can limit the LLMs' ability to generalize across domains. Moreover, the editing
strategies in these methods are not optimally designed for text-generation
tasks. To address these limitations, we propose a neural programmer-interpreter
approach that preserves the domain generalization ability of LLMs when editing
their output. The editing actions in this framework are specifically devised
for text generation. Extensive experiments demonstrate that the
programmer-interpreter significantly enhances GPT-3.5's performance in logical
form-to-text conversion and low-resource machine translation, surpassing other
state-of-the-art (SOTA) LLM post-editing methods in cross-domain settings.
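The abstract does not spell out the editing actions, so the following is only a minimal sketch of the programmer-interpreter idea: a small "programmer" model emits token-level edit actions and a deterministic "interpreter" applies them to the LLM draft. The action set (keep/delete/replace/insert) and the `propose_edit_program` stub are illustrative assumptions, not the authors' implementation.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical edit actions; the paper devises its own action set for text generation.
@dataclass
class EditAction:
    op: str                      # "keep" | "delete" | "replace" | "insert"
    position: int                # token index in the original draft
    text: Optional[str] = None   # payload for "replace" / "insert"

def interpret(draft_tokens: List[str], program: List[EditAction]) -> List[str]:
    """Interpreter: deterministically applies the edit program to the LLM draft."""
    out = list(draft_tokens)
    # Apply right-to-left so earlier positions stay valid after insertions/deletions.
    for action in sorted(program, key=lambda a: a.position, reverse=True):
        if action.op == "delete":
            del out[action.position]
        elif action.op == "replace":
            out[action.position] = action.text
        elif action.op == "insert":
            out.insert(action.position, action.text)
        # "keep" is a no-op
    return out

def propose_edit_program(source: str, draft_tokens: List[str]) -> List[EditAction]:
    """Programmer: in the paper this is a learned model; here a trivial stub."""
    return [EditAction("keep", i) for i in range(len(draft_tokens))]

if __name__ == "__main__":
    draft = "the cat sat on on mat".split()
    program = [EditAction("delete", 4), EditAction("insert", 5, "the")]
    print(" ".join(interpret(draft, program)))  # -> "the cat sat on the mat"
```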
Related papers
- UltraGen: Extremely Fine-grained Controllable Generation via Attribute Reconstruction and Global Preference Optimization [33.747872934103334]
Existing methods focus mainly on a small set of attributes (around 3 to 5), and their performance degrades significantly when the number of attributes increases by an order of magnitude.
We propose a novel zero-shot approach for extremely fine-grained controllable generation (EFCG).
Our framework significantly improves the constraint satisfaction rate (CSR) and text quality for EFCG by mitigating bias and alleviating attention dilution.
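As a hedged illustration of the constraint satisfaction rate (CSR) mentioned above: a sketch assuming each attribute constraint comes with a checker function. The checkers below are toy placeholders, not UltraGen's attributes.

```python
from typing import Callable, Dict

# Toy attribute checkers (assumptions for illustration); real EFCG settings
# involve far larger sets of fine-grained constraints.
checkers: Dict[str, Callable[[str], bool]] = {
    "mentions_price": lambda t: "$" in t,
    "under_30_words": lambda t: len(t.split()) <= 30,
    "positive_tone": lambda t: any(w in t.lower() for w in ("great", "excellent", "love")),
}

def constraint_satisfaction_rate(text: str, constraints: Dict[str, Callable[[str], bool]]) -> float:
    """Fraction of attribute constraints that the generated text satisfies."""
    satisfied = sum(1 for check in constraints.values() if check(text))
    return satisfied / len(constraints)

print(constraint_satisfaction_rate("A great phone for $199.", checkers))  # 1.0: all constraints met
```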
arXiv Detail & Related papers (2025-02-17T23:28:58Z)
- LLM Program Optimization via Retrieval Augmented Search [71.40092732256252]
We propose a black-box adaptation method called Retrieval Augmented Search (RAS) that performs beam search over candidate optimizations.
We show that RAS performs 1.8× better than prior state-of-the-art black-box adaptation strategies.
We also propose a method called AEGIS for improving interpretability by decomposing training examples into "atomic edits".
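A minimal sketch of beam search over candidate program optimizations, with the retrieval and proposal steps stubbed out; names such as `propose_rewrites` and `retrieve_similar_examples` are placeholders, not the RAS API.

```python
from typing import Callable, List, Tuple

def beam_search_optimize(
    program: str,
    propose_rewrites: Callable[[str, List[str]], List[str]],   # e.g. an LLM prompted with retrieved examples
    score: Callable[[str], float],                              # e.g. measured speedup; higher is better
    retrieve_similar_examples: Callable[[str], List[str]],
    beam_width: int = 4,
    depth: int = 3,
) -> str:
    """Keep the top-`beam_width` candidates at each step; return the best program found."""
    beam: List[Tuple[float, str]] = [(score(program), program)]
    for _ in range(depth):
        candidates = list(beam)
        for _, prog in beam:
            examples = retrieve_similar_examples(prog)
            for rewrite in propose_rewrites(prog, examples):
                candidates.append((score(rewrite), rewrite))
        beam = sorted(candidates, key=lambda c: c[0], reverse=True)[:beam_width]
    return beam[0][1]

# Toy usage with stubbed components:
best = beam_search_optimize(
    "for i in range(len(xs)): total += xs[i]",
    propose_rewrites=lambda prog, ex: [prog.replace(
        "for i in range(len(xs)): total += xs[i]", "total = sum(xs)")],
    score=lambda prog: -len(prog),                  # stand-in metric: shorter code scores higher
    retrieve_similar_examples=lambda prog: [],
)
print(best)  # -> "total = sum(xs)"
```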
arXiv Detail & Related papers (2025-01-31T06:34:47Z)
- Assessing Human Editing Effort on LLM-Generated Texts via Compression-Based Edit Distance [2.1792283995628465]
Existing edit distance metrics, such as Levenshtein, BLEU, ROUGE, and TER, often fail to accurately measure the effort required for post-editing.
We introduce a novel compression-based edit distance metric grounded in the Lempel-Ziv-77 algorithm.
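A rough sketch of a compression-based distance in the spirit described above, using Python's zlib (DEFLATE, which builds on LZ77) as a stand-in; the exact formulation in the paper may differ.

```python
import zlib

def _c(data: bytes) -> int:
    """Length of the compressed byte string."""
    return len(zlib.compress(data, 9))

def compression_distance(original: str, edited: str) -> float:
    """Normalized compression distance between a draft and its post-edited version.
    Lower values suggest less editing effort; higher values suggest heavier rewriting."""
    x, y = original.encode("utf-8"), edited.encode("utf-8")
    cx, cy, cxy = _c(x), _c(y), _c(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)

draft = "The cat sat on the mat."
print(compression_distance(draft, draft))                         # low: the edit adds no new information
print(compression_distance(draft, "Dogs bark loudly at night."))  # higher: essentially a full rewrite
```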
arXiv Detail & Related papers (2024-12-23T06:29:25Z)
- Effective Text Adaptation for LLM-based ASR through Soft Prompt Fine-Tuning [12.676026149146772]
Large Language Models (LLMs) have reshaped Automatic Speech Recognition (ASR).
Fine-tuning such ASR on text-only data without paired prompts may diminish the effectiveness of domain-specific knowledge.
We propose a two-step soft prompt fine-tuning strategy that enhances domain-specific text adaptation.
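A minimal PyTorch-style sketch of the soft-prompt idea (learnable prompt vectors prepended to the input embeddings while the base model stays frozen); the two-step schedule and the ASR-specific wiring in the paper are not reproduced here.

```python
import torch
import torch.nn as nn

class SoftPrompt(nn.Module):
    """Learnable prompt embeddings prepended to the token embeddings of a frozen LM."""
    def __init__(self, num_prompt_tokens: int, hidden_size: int):
        super().__init__()
        self.prompt = nn.Parameter(torch.randn(num_prompt_tokens, hidden_size) * 0.02)

    def forward(self, token_embeds: torch.Tensor) -> torch.Tensor:
        # token_embeds: (batch, seq_len, hidden_size)
        batch = token_embeds.size(0)
        prompt = self.prompt.unsqueeze(0).expand(batch, -1, -1)
        return torch.cat([prompt, token_embeds], dim=1)

# Usage sketch: only soft_prompt.parameters() go to the optimizer during text-only
# adaptation; the LM's own parameters stay frozen.
soft_prompt = SoftPrompt(num_prompt_tokens=16, hidden_size=768)
dummy_embeds = torch.randn(2, 10, 768)
print(soft_prompt(dummy_embeds).shape)  # torch.Size([2, 26, 768])
```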
arXiv Detail & Related papers (2024-12-09T20:22:06Z)
- Unveiling Large Language Models Generated Texts: A Multi-Level Fine-Grained Detection Framework [9.976099891796784]
Large language models (LLMs) have transformed human writing by enhancing grammar correction, content expansion, and stylistic refinement.
Existing detection methods, which mainly rely on single-feature analysis and binary classification, often fail to effectively identify LLM-generated text in academic contexts.
We propose a novel Multi-level Fine-grained Detection framework that detects LLM-generated text by integrating low-level structural, high-level semantic, and deep-level linguistic features.
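A hedged sketch of the general idea of fusing feature groups for detection; the concrete structural, semantic, and linguistic features and the classifier used in the paper are not specified here, so the ones below are simple placeholders.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def structural_features(text: str) -> np.ndarray:
    # Placeholder low-level features: word count, average word length, punctuation rate.
    words = text.split()
    return np.array([len(words),
                     np.mean([len(w) for w in words]) if words else 0.0,
                     sum(c in ".,;:!?" for c in text) / max(len(text), 1)])

def semantic_features(text: str) -> np.ndarray:
    # Placeholder: in practice, sentence embeddings or LLM-derived scores.
    return np.array([len(set(text.lower().split())) / max(len(text.split()), 1)])

def linguistic_features(text: str) -> np.ndarray:
    # Placeholder: in practice, syntactic/discourse statistics.
    return np.array([text.count(",")])

def featurize(text: str) -> np.ndarray:
    return np.concatenate([structural_features(text), semantic_features(text), linguistic_features(text)])

# Toy training data; real systems would use large labeled corpora.
texts = ["Human-written example with varied, natural phrasing!", "LLM generated text tends to be uniform."]
labels = [0, 1]  # 0 = human, 1 = LLM-generated
clf = LogisticRegression().fit(np.stack([featurize(t) for t in texts]), labels)
print(clf.predict([featurize("Another uniform sentence produced by a model.")]))
```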
arXiv Detail & Related papers (2024-10-18T07:25:00Z)
- One Token Can Help! Learning Scalable and Pluggable Virtual Tokens for Retrieval-Augmented Large Language Models [67.49462724595445]
Retrieval-augmented generation (RAG) is a promising way to improve large language models (LLMs).
We propose a novel method that involves learning scalable and pluggable virtual tokens for RAG.
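A rough sketch of "pluggable" virtual tokens: a handful of learned embeddings inserted between the retrieved context and the query, which can be switched on or off without touching the frozen LLM. The module name and the placement are assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

class PluggableVirtualTokens(nn.Module):
    """A few trainable embeddings inserted between retrieved-context and query embeddings."""
    def __init__(self, num_tokens: int, hidden_size: int):
        super().__init__()
        self.embeddings = nn.Parameter(torch.randn(num_tokens, hidden_size) * 0.02)

    def forward(self, context_embeds: torch.Tensor, query_embeds: torch.Tensor,
                enabled: bool = True) -> torch.Tensor:
        # context_embeds: (batch, ctx_len, h); query_embeds: (batch, q_len, h)
        if not enabled:  # "pluggable": fall back to a plain RAG input sequence
            return torch.cat([context_embeds, query_embeds], dim=1)
        batch = query_embeds.size(0)
        virtual = self.embeddings.unsqueeze(0).expand(batch, -1, -1)
        return torch.cat([context_embeds, virtual, query_embeds], dim=1)

vt = PluggableVirtualTokens(num_tokens=1, hidden_size=768)  # even a single token, per the title
ctx, query = torch.randn(1, 50, 768), torch.randn(1, 12, 768)
print(vt(ctx, query).shape)  # torch.Size([1, 63, 768])
```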
arXiv Detail & Related papers (2024-05-30T03:44:54Z)
- LLM can Achieve Self-Regulation via Hyperparameter Aware Generation [88.69052513433603]
Large Language Models (LLMs) employ diverse decoding strategies to control the generated text.
Are LLMs aware of the existence of these decoding strategies, and are they capable of regulating themselves?
We propose a novel text generation paradigm termed Hyperparameter Aware Generation (HAG).
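A hedged sketch of the two-stage idea behind hyperparameter-aware generation: the model is first asked to propose decoding hyperparameters for the given input, and those are then used for the actual generation. The `llm_call` stub and the prompt wording are placeholders, not the HAG implementation.

```python
import json
from typing import Callable, Dict

def hyperparameter_aware_generate(task_input: str,
                                  llm_call: Callable[[str, Dict], str]) -> str:
    # Stage 1: ask the model to self-select decoding hyperparameters for this input.
    meta_prompt = (
        "Choose decoding hyperparameters for the task below. "
        'Reply with JSON like {"temperature": 0.7, "top_p": 0.9}.\n'
        f"Task: {task_input}"
    )
    raw = llm_call(meta_prompt, {"temperature": 0.0, "top_p": 1.0})
    try:
        hparams = json.loads(raw)
    except json.JSONDecodeError:
        hparams = {"temperature": 0.7, "top_p": 0.9}  # fallback defaults
    # Stage 2: generate the actual answer with the self-selected hyperparameters.
    return llm_call(task_input, hparams)

# Toy stub standing in for a real LLM API call.
def fake_llm(prompt: str, hparams: Dict) -> str:
    return '{"temperature": 0.3, "top_p": 0.8}' if "hyperparameters" in prompt else f"answer@{hparams}"

print(hyperparameter_aware_generate("Translate 'bonjour' to English.", fake_llm))
```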
arXiv Detail & Related papers (2024-02-17T11:18:22Z)
- Contextual Refinement of Translations: Large Language Models for Sentence and Document-Level Post-Editing [12.843274390224853]
Large Language Models (LLMs) have demonstrated considerable success in various Natural Language Processing tasks.
We show that they have yet to attain state-of-the-art performance in Neural Machine Translation.
We propose adapting LLMs as Automatic Post-Editors (APE) rather than direct translators.
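A minimal sketch of using an LLM as an automatic post-editor rather than a direct translator: the model receives the source sentence plus an MT draft and returns a corrected translation. The prompt wording and the `chat` stub are assumptions, not the paper's prompt.

```python
from typing import Callable

def build_ape_prompt(source: str, draft_translation: str, src_lang: str, tgt_lang: str) -> str:
    return (
        f"You are a professional {tgt_lang} post-editor.\n"
        f"Source ({src_lang}): {source}\n"
        f"Machine translation ({tgt_lang}): {draft_translation}\n"
        f"Minimally edit the machine translation so it is fluent and faithful to the source. "
        f"Return only the corrected {tgt_lang} sentence."
    )

def post_edit(source: str, draft: str, chat: Callable[[str], str]) -> str:
    """`chat` stands in for any LLM chat-completion call."""
    return chat(build_ape_prompt(source, draft, "German", "English")).strip()

print(build_ape_prompt("Der Hund bellt.", "The dog is bark.", "German", "English"))
```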
arXiv Detail & Related papers (2023-10-23T12:22:15Z)
- Reducing Sequence Length by Predicting Edit Operations with Large Language Models [50.66922361766939]
This paper proposes predicting edit spans for the source text for local sequence transduction tasks.
We apply instruction tuning for Large Language Models on the supervision data of edit spans.
Experiments show that the proposed method achieves comparable performance to the baseline in four tasks.
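A small sketch of the edit-span idea: instead of regenerating the whole target, the model predicts (start, end, replacement) spans over the source, which are then applied locally. The character-level span format here is an assumption for illustration.

```python
from typing import List, Tuple

def apply_edit_spans(source: str, spans: List[Tuple[int, int, str]]) -> str:
    """Apply character-level (start, end, replacement) edits; untouched text is copied as-is."""
    result, cursor = [], 0
    for start, end, replacement in sorted(spans):
        result.append(source[cursor:start])
        result.append(replacement)
        cursor = end
    result.append(source[cursor:])
    return "".join(result)

src = "She go to school every days."
# e.g. a grammatical-error-correction output: two local edits instead of a full rewrite
edits = [(4, 6, "goes"), (23, 27, "day")]
print(apply_edit_spans(src, edits))  # -> "She goes to school every day."
```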
arXiv Detail & Related papers (2023-05-19T17:51:05Z)
- Progressive Generation of Long Text with Pretrained Language Models [83.62523163717448]
Large-scale language models (LMs) pretrained on massive corpora of text, such as GPT-2, are powerful open-domain text generators.
It is still challenging for such models to generate coherent long passages of text, especially when the models are fine-tuned to the target domain on a small corpus.
We propose a simple but effective method of generating text in a progressive manner, inspired by generating images from low to high resolution.
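A rough sketch of the coarse-to-fine idea: a first stage drafts a skeleton of key points and later stages expand it into full prose. The staging, prompts, and the trivial `echo` stub below are illustrative assumptions, not the paper's exact pipeline.

```python
from typing import Callable

def progressive_generate(topic: str, llm: Callable[[str], str], num_stages: int = 3) -> str:
    """Refine a draft in stages, from a coarse outline toward full text."""
    draft = llm(f"List the key points, in order, for a passage about: {topic}")
    for stage in range(2, num_stages + 1):
        draft = llm(
            f"Stage {stage}: expand and refine the draft below into more detailed, "
            f"coherent prose about '{topic}', keeping its structure.\n\n{draft}"
        )
    return draft

# `llm` stands in for any text generator (the paper trains a model per stage);
# this trivial stub just lets the sketch run end-to-end.
echo = lambda prompt: prompt.splitlines()[-1]
print(progressive_generate("the history of coffee", echo))
```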
arXiv Detail & Related papers (2020-06-28T21:23:05Z)