Implicit Unlikelihood Training: Improving Neural Text Generation with
Reinforcement Learning
- URL: http://arxiv.org/abs/2101.04229v1
- Date: Mon, 11 Jan 2021 23:10:01 GMT
- Title: Implicit Unlikelihood Training: Improving Neural Text Generation with
Reinforcement Learning
- Authors: Evgeny Lagutin and Daniil Gavrilov and Pavel Kalaidin
- Abstract summary: We propose fine-tuning a language model with reinforcement learning, directly optimizing for better generation.
We apply this approach to minimizing repetition in generated text, and show that, when combined with unlikelihood training, our method further reduces repetition without impacting the language model quality.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Likelihood training and maximization-based decoding result in dull and
repetitive generated texts even when using powerful language models (Holtzman
et al., 2019). Adding a loss function for regularization was shown to improve
text generation output by helping avoid unwanted properties, such as
contradiction or repetition (Li et al., 2020). In this work, we propose
fine-tuning a language model by using policy gradient reinforcement learning,
directly optimizing for better generation. We apply this approach to minimizing
repetition in generated text, and show that, when combined with unlikelihood
training (Welleck et al., 2020), our method further reduces repetition without
impacting the language model quality. We also evaluate other methods for
improving generation at training and decoding time, and compare them using
various metrics aimed at controlling for better text generation output.
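A minimal sketch of the policy-gradient idea described in the abstract, assuming a Hugging Face-style causal LM whose forward pass returns `.logits`; the distinct-token reward and all hyperparameters here are illustrative assumptions, not the authors' implementation:

```python
import torch

def repetition_reward(sampled_ids: torch.Tensor) -> torch.Tensor:
    # Illustrative reward: fraction of distinct tokens in each sampled
    # continuation (higher = less repetition).
    return torch.tensor([len(set(seq.tolist())) / len(seq) for seq in sampled_ids])

def policy_gradient_loss(model, prefix_ids: torch.Tensor, sample_len: int = 64):
    # Sample a continuation from the current policy (the LM itself),
    # keeping per-step log-probabilities for REINFORCE.
    generated, log_probs = prefix_ids, []
    for _ in range(sample_len):
        logits = model(generated).logits[:, -1, :]
        dist = torch.distributions.Categorical(logits=logits)
        token = dist.sample()
        log_probs.append(dist.log_prob(token))
        generated = torch.cat([generated, token.unsqueeze(-1)], dim=-1)
    log_probs = torch.stack(log_probs, dim=1)            # (batch, sample_len)
    reward = repetition_reward(generated[:, prefix_ids.size(1):]).to(log_probs.device)
    baseline = reward.mean()                             # variance reduction
    # REINFORCE: raise log-probs of samples with above-average reward.
    return -((reward - baseline).unsqueeze(1) * log_probs).mean()
```

In the paper this kind of objective is combined with the unlikelihood loss of Welleck et al. (2020); the baseline-subtracted REINFORCE estimator above stands in for whatever variance reduction the authors actually use.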
Related papers
- Enhancing Text Generation in Joint NLG/NLU Learning Through Curriculum Learning, Semi-Supervised Training, and Advanced Optimization Techniques [0.0]
This research paper developed a novel approach to improve text generation in the context of joint Natural Language Generation (NLG) and Natural Language Understanding (NLU) learning.
The data is prepared by gathering and preprocessing annotated datasets, including cleaning, tokenization, stemming, and stop-word removal.
Transformer-based encoders and decoders capture long-range dependencies and improve source-target sequence modelling.
Reinforcement learning with policy gradient techniques, semi-supervised training, improved attention mechanisms, and differentiable approximations are employed to fine-tune the models and handle complex linguistic tasks effectively.
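As a concrete illustration of the preprocessing steps this entry lists (cleaning, tokenization, stemming, stop-word removal), here is a hedged sketch using NLTK; the paper's exact pipeline may differ:

```python
# Requires the NLTK "punkt" and "stopwords" data packages
# (nltk.download("punkt"); nltk.download("stopwords")).
import re
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer
from nltk.tokenize import word_tokenize

def preprocess(text: str) -> list[str]:
    text = re.sub(r"[^a-zA-Z\s]", " ", text.lower())  # cleaning
    tokens = word_tokenize(text)                      # tokenization
    stops = set(stopwords.words("english"))
    stemmer = PorterStemmer()
    # Stop-word removal followed by stemming.
    return [stemmer.stem(t) for t in tokens if t not in stops]
```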
arXiv Detail & Related papers (2024-10-17T12:43:49Z)
- Retrieval is Accurate Generation [99.24267226311157]
We introduce a novel method that selects context-aware phrases from a collection of supporting documents.
Our model achieves the best performance and the lowest latency among several retrieval-augmented baselines.
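A toy sketch of the phrase-selection idea, scoring candidate phrase embeddings against a context embedding by dot product; the embeddings and scoring here are placeholders, not the paper's model:

```python
import numpy as np

def select_phrase(context_vec: np.ndarray, phrase_vecs: np.ndarray,
                  phrases: list[str]) -> str:
    # Score every candidate phrase embedding against the context embedding
    # and return the best match.
    scores = phrase_vecs @ context_vec
    return phrases[int(np.argmax(scores))]
```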
arXiv Detail & Related papers (2024-02-27T14:16:19Z)
- Unlocking Anticipatory Text Generation: A Constrained Approach for Large Language Models Decoding [75.06872859716049]
Large Language Models (LLMs) have demonstrated a powerful ability for text generation.
However, undesired behaviors such as toxicity or hallucinations can manifest.
We propose formalizing text generation as a future-constrained generation problem.
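A toy sketch of the future-constrained framing: candidate next tokens are rescored with a lookahead bonus from an external constraint function. This illustrates the general idea under assumed names (batch size 1, a user-supplied `constraint_score`), not the paper's algorithm:

```python
import torch

def constrained_step(model, input_ids, constraint_score, k=5, alpha=1.0):
    # Rescore the top-k next tokens with a lookahead bonus from a
    # user-supplied constraint function (e.g., a toxicity or keyword check).
    logits = model(input_ids).logits[:, -1, :]
    topk = torch.topk(logits, k, dim=-1)
    best_token, best_score = None, float("-inf")
    for logit, token in zip(topk.values[0], topk.indices[0]):
        candidate = torch.cat([input_ids, token.view(1, 1)], dim=-1)
        score = logit.item() + alpha * constraint_score(candidate)
        if score > best_score:
            best_token, best_score = token, score
    return best_token
```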
arXiv Detail & Related papers (2023-12-11T06:35:33Z)
- Click: Controllable Text Generation with Sequence Likelihood Contrastive Learning [69.35360098882606]
We introduce Click for controllable text generation, which needs no modification to the model architecture.
It employs a contrastive loss on sequence likelihood, which fundamentally decreases the generation probability of negative samples.
It also adopts a novel likelihood ranking-based strategy to construct contrastive samples from model generations.
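A sketch of a contrastive loss on sequence likelihood in the spirit of this entry: a hinge pushes the log-likelihood of a negative sample below that of a positive one. The margin and pairing strategy are illustrative assumptions:

```python
import torch
import torch.nn.functional as F

def sequence_log_prob(model, ids: torch.Tensor) -> torch.Tensor:
    # Sum of per-token log-probabilities of each sequence under the model.
    logits = model(ids).logits[:, :-1, :]
    logp = F.log_softmax(logits, dim=-1)
    targets = ids[:, 1:].unsqueeze(-1)
    return logp.gather(-1, targets).squeeze(-1).sum(dim=1)

def contrastive_likelihood_loss(model, pos_ids, neg_ids, margin=1.0):
    pos = sequence_log_prob(model, pos_ids)
    neg = sequence_log_prob(model, neg_ids)
    # Hinge on the likelihood gap: penalize negatives that score too high.
    return F.relu(margin - (pos - neg)).mean()
```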
arXiv Detail & Related papers (2023-06-06T01:56:44Z)
- Language Model Evaluation in Open-ended Text Generation [0.76146285961466]
We study different evaluation metrics that have been proposed to evaluate quality, diversity and consistency of machine-generated text.
From there, we propose a practical pipeline to evaluate language models on open-ended generation tasks.
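One widely used diversity metric of the kind such evaluations rely on is distinct-n, the ratio of unique n-grams to total n-grams in generated text; a minimal implementation (not specific to this paper):

```python
def distinct_n(tokens: list[str], n: int = 2) -> float:
    # Ratio of unique n-grams to total n-grams; higher = more diverse.
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return len(set(ngrams)) / max(len(ngrams), 1)

# Example: repetitive text scores low.
# distinct_n("the cat sat on the mat the cat sat".split(), 2)
```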
arXiv Detail & Related papers (2021-08-08T06:16:02Z)
- Text Generation with Efficient (Soft) Q-Learning [91.47743595382758]
Reinforcement learning (RL) offers a more flexible solution by allowing users to plug in arbitrary task metrics as rewards.
We introduce a new RL formulation for text generation from the soft Q-learning perspective.
We apply the approach to a wide range of tasks, including learning from noisy/negative examples, adversarial attacks, and prompt generation.
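A sketch of the soft Q-learning view, where per-token logits are read as Q-values and trained toward a soft Bellman target; the reward shape, discount, and temperature below are illustrative, not the paper's settings:

```python
import torch

def soft_q_target(q_next: torch.Tensor, reward: torch.Tensor,
                  gamma: float = 1.0, tau: float = 1.0) -> torch.Tensor:
    # Soft state value: V(s') = tau * logsumexp(Q(s', .) / tau),
    # a smooth maximum over next-token Q-values.
    v_next = tau * torch.logsumexp(q_next / tau, dim=-1)
    # Soft Bellman backup used as the regression target for Q(s, a).
    return reward + gamma * v_next
```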
arXiv Detail & Related papers (2021-06-14T18:48:40Z)
- Data Augmentation in Natural Language Processing: A Novel Text Generation Approach for Long and Short Text Classifiers [8.19984844136462]
We present and evaluate a text generation method suitable to increase the performance of classifiers for long and short texts.
In a simulated low-data regime, additive accuracy gains of up to 15.53% are achieved.
We discuss implications and patterns for the successful application of our approach on different types of datasets.
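A hedged sketch of label-preserving augmentation by generation, where a generator function (a placeholder here, e.g., a fine-tuned language-model sampler) produces extra training examples per source text:

```python
def augment(generate_fn, texts, labels, n_per_example=2):
    # Pair each generated variant with the label of its source text.
    augmented = []
    for text, label in zip(texts, labels):
        for _ in range(n_per_example):
            augmented.append((generate_fn(text), label))
    return augmented
```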
arXiv Detail & Related papers (2021-03-26T13:16:07Z)
- Improving Text Generation with Student-Forcing Optimal Transport [122.11881937642401]
We propose using optimal transport (OT) to match the sequences generated in training and testing modes.
An extension is also proposed to improve the OT learning, based on the structural and contextual information of the text sequences.
The effectiveness of the proposed method is validated on machine translation, text summarization, and text generation tasks.
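A sketch of matching two sets of token embeddings (teacher-forced vs. free-running) with optimal transport via a few Sinkhorn iterations; the entropic regularizer `eps` and iteration count are illustrative choices, not the paper's configuration:

```python
import torch

def sinkhorn_ot_loss(x: torch.Tensor, y: torch.Tensor,
                     eps: float = 0.1, iters: int = 50) -> torch.Tensor:
    # x: (n, d) embeddings from training mode; y: (m, d) from generation mode.
    cost = torch.cdist(x, y, p=2)                 # pairwise transport costs
    K = torch.exp(-cost / eps)                    # Gibbs kernel
    u = torch.ones(x.size(0), device=x.device) / x.size(0)
    v = torch.ones(y.size(0), device=y.device) / y.size(0)
    a, b = u.clone(), v.clone()
    for _ in range(iters):                        # Sinkhorn scaling updates
        a = u / (K @ b)
        b = v / (K.t() @ a)
    plan = torch.diag(a) @ K @ torch.diag(b)      # approximate transport plan
    return (plan * cost).sum()                    # OT cost as the loss
```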
arXiv Detail & Related papers (2020-10-12T19:42:25Z)
- Improving Adversarial Text Generation by Modeling the Distant Future [155.83051741029732]
We consider a text planning scheme and present a model-based imitation-learning approach to alleviate the aforementioned issues.
We propose a novel guider network to focus on the generative process over a longer horizon, which can assist next-word prediction and provide intermediate rewards for generator optimization.
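A toy sketch of the intermediate-reward idea: a guider module predicts a future feature and each step is rewarded by how close the generator actually gets to it. All names here are hypothetical, not the paper's architecture:

```python
import torch
import torch.nn.functional as F

def intermediate_reward(guider, state_feat: torch.Tensor,
                        future_feat: torch.Tensor) -> torch.Tensor:
    # Reward = similarity between the guider's predicted future feature
    # and the feature the generator actually reached.
    predicted = guider(state_feat)
    return F.cosine_similarity(predicted, future_feat, dim=-1)
```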
arXiv Detail & Related papers (2020-05-04T05:45:13Z)