Selective Token Generation for Few-shot Natural Language Generation
- URL: http://arxiv.org/abs/2209.08206v1
- Date: Sat, 17 Sep 2022 00:48:52 GMT
- Title: Selective Token Generation for Few-shot Natural Language Generation
- Authors: Daejin Jo, Taehwan Kwon, Eun-Sol Kim, Sungwoong Kim
- Abstract summary: We develop a novel additive learning algorithm based on reinforcement learning (RL).
We show that the proposed selective token generation significantly outperforms the previous additive learning algorithms based on the PLMs.
- Score: 19.015739016376532
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Natural language modeling with limited training data is a challenging
problem, and many algorithms make use of large-scale pretrained language models
(PLMs) for this due to their strong generalization ability. Among them, additive
learning that incorporates a task-specific adapter on top of the fixed
large-scale PLM has been widely used in the few-shot setting. However, this
added adapter can still easily disregard the knowledge of the PLM, especially
for few-shot natural language generation (NLG) since an entire sequence is
usually generated by only the newly trained adapter. Therefore, in this work,
we develop a novel additive learning algorithm based on reinforcement learning
(RL) that selectively outputs language tokens between the task-general PLM and
the task-specific adapter during both training and inference. This output token
selection over the two generators allows the adapter to focus solely on the
task-relevant parts of sequence generation, which makes it more robust to
overfitting and more stable during RL training. In addition, to obtain an
adapter that complements the PLM for each few-shot task, we employ a separate
selection module that is simultaneously trained using
RL. Experimental results on various few-shot NLG tasks including question
answering, data-to-text generation and text summarization demonstrate that the
proposed selective token generation significantly outperforms the previous
additive learning algorithms based on the PLMs.
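The decoding loop described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `plm`, `adapter`, and `selector` are hypothetical callables that map the current token prefix to next-token logits (the selector to a two-way choice), and the greedy choices shown here stand in for the stochastic policy that the paper optimizes with RL.

```python
import torch

def generate_selectively(plm, adapter, selector, input_ids, max_new_tokens, eos_id):
    """Sketch of selective token generation between a frozen PLM and a task adapter.

    plm, adapter: callables mapping a prefix LongTensor of shape (1, T) to
                  next-token logits of shape (1, vocab_size).
    selector:     callable mapping the same prefix to 2 logits deciding which
                  generator emits the next token (0 = PLM, 1 = adapter).
    All three names are illustrative placeholders, not an actual API.
    """
    ids = input_ids
    for _ in range(max_new_tokens):
        with torch.no_grad():
            plm_logits = plm(ids)          # task-general distribution from the fixed PLM
            adapter_logits = adapter(ids)  # task-specific distribution from the adapter
            choice = selector(ids).argmax(dim=-1)  # greedy stand-in for the RL selection policy
        logits = adapter_logits if choice.item() == 1 else plm_logits
        next_id = logits.argmax(dim=-1, keepdim=True)  # greedy decoding for brevity
        ids = torch.cat([ids, next_id], dim=-1)
        if next_id.item() == eos_id:
            break
    return ids
```

In the setting the abstract describes, the PLM stays frozen while the adapter and the selection module are the trainable components; during training the per-token source choices would be sampled from the selector's policy and credited with a task-level reward via RL rather than chosen greedily as above.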
Related papers
- Language Models can Self-Lengthen to Generate Long Texts [74.96074422345806]
This paper introduces an innovative iterative training framework called Self-Lengthen.
It leverages only the intrinsic knowledge and skills of Large Language Models without the need for auxiliary data or proprietary models.
Experiments on benchmarks and human evaluations show that Self-Lengthen outperforms existing methods in long-text generation.
arXiv Detail & Related papers (2024-10-31T13:47:10Z) - Prompt Optimization via Adversarial In-Context Learning [51.18075178593142]
adv-ICL is implemented as a two-player game between a generator and a discriminator.
The generator tries to generate realistic enough output to fool the discriminator.
We show that adv-ICL results in significant improvements over state-of-the-art prompt optimization techniques.
arXiv Detail & Related papers (2023-12-05T09:44:45Z) - Instructed Language Models with Retrievers Are Powerful Entity Linkers [87.16283281290053]
Instructed Generative Entity Linker (INSGENEL) is the first approach that enables causal language models to perform entity linking over knowledge bases.
INSGENEL outperforms previous generative alternatives with +6.8 F1 points gain on average.
arXiv Detail & Related papers (2023-11-06T16:38:51Z) - Graph Neural Prompting with Large Language Models [32.97391910476073]
Graph Neural Prompting (GNP) is a novel plug-and-play method to assist pre-trained language models in learning beneficial knowledge from knowledge graphs.
Extensive experiments on multiple datasets demonstrate the superiority of GNP on both commonsense and biomedical reasoning tasks.
arXiv Detail & Related papers (2023-09-27T06:33:29Z) - Instruction Position Matters in Sequence Generation with Large Language Models [67.87516654892343]
Large language models (LLMs) are capable of performing conditional sequence generation tasks, such as translation or summarization.
We propose enhancing the instruction-following capability of LLMs by shifting the position of task instructions after the input sentences.
arXiv Detail & Related papers (2023-08-23T12:36:57Z) - Benchmarking Large Language Model Capabilities for Conditional Generation [15.437176676169997]
We discuss how to adapt existing application-specific generation benchmarks to PLMs.
We show that PLMs differ in their applicability to different data regimes and their generalization to multiple languages.
arXiv Detail & Related papers (2023-06-29T08:59:40Z) - Preference-grounded Token-level Guidance for Language Model Fine-tuning [105.88789610320426]
Aligning language models with preferences is an important problem in natural language generation.
For LM training, based on the amount of supervised data, we present two *minimalist* learning objectives that utilize the learned guidance.
In experiments, our method performs competitively on two distinct representative LM tasks.
arXiv Detail & Related papers (2023-06-01T07:00:07Z) - Text Generation with Efficient (Soft) Q-Learning [91.47743595382758]
Reinforcement learning (RL) offers a more flexible solution by allowing users to plug in arbitrary task metrics as reward.
We introduce a new RL formulation for text generation from the soft Q-learning perspective.
We apply the approach to a wide range of tasks, including learning from noisy/negative examples, adversarial attacks, and prompt generation.
arXiv Detail & Related papers (2021-06-14T18:48:40Z) - Zero-shot Learning by Generating Task-specific Adapters [38.452434222367515]
We introduce Hypter, a framework that improves zero-shot transferability by training a hypernetwork to generate task-specific adapters from task descriptions.
This formulation enables learning at task level, and greatly reduces the number of parameters by using light-weight adapters.
arXiv Detail & Related papers (2021-01-02T10:50:23Z)