Multi-Task Retrieval-Augmented Text Generation with Relevance Sampling
- URL: http://arxiv.org/abs/2207.03030v1
- Date: Thu, 7 Jul 2022 00:57:02 GMT
- Title: Multi-Task Retrieval-Augmented Text Generation with Relevance Sampling
- Authors: Sebastian Hofstätter, Jiecao Chen, Karthik Raman, Hamed Zamani
- Abstract summary: We study multi-task training of retrieval-augmented generation models for knowledge-intensive tasks.
We filter training examples via a confidence threshold on the relevance labels, i.e., whether a pair is answerable by the knowledge base or not.
- Score: 19.17759446168802
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper studies multi-task training of retrieval-augmented generation
models for knowledge-intensive tasks. We propose to clean the training set by
utilizing a distinct property of knowledge-intensive generation: The connection
of query-answer pairs to items in the knowledge base. We filter training
examples via a confidence threshold on the relevance labels, i.e., whether a pair
is answerable by the knowledge base or not. We train a single Fusion-in-Decoder
(FiD) generator on seven combined tasks of the KILT benchmark. The experimental
results suggest that our simple yet effective approach substantially improves
competitive baselines on two strongly imbalanced tasks; and shows either
smaller improvements or no significant regression on the remaining tasks.
Furthermore, we demonstrate our multi-task training with relevance label
sampling scales well with increased model capacity and achieves
state-of-the-art results in five out of seven KILT tasks.
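To make the filtering step concrete, here is a minimal sketch of how such relevance sampling could be implemented. It is an illustrative reading of the abstract, not the authors' code: the `relevance_confidence` field, the 0.5 threshold, and the helper names are assumptions.

```python
# Illustrative sketch of relevance sampling for a multi-task training set.
# Assumption: each example carries a relevance confidence, e.g. the estimated
# probability that the query-answer pair is answerable from the knowledge base,
# produced by a retriever or relevance classifier. Threshold value is hypothetical.

from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Example:
    task: str                    # one of the seven KILT tasks
    query: str
    answer: str
    relevance_confidence: float  # P(pair is answerable by the knowledge base)


def filter_by_relevance(examples: List[Example], threshold: float = 0.5) -> List[Example]:
    """Keep only examples whose relevance confidence clears the threshold."""
    return [ex for ex in examples if ex.relevance_confidence >= threshold]


def build_multitask_training_set(per_task_examples: Dict[str, List[Example]],
                                 threshold: float = 0.5) -> List[Example]:
    """Combine the filtered examples of all tasks into a single training set,
    which would then be used to train one Fusion-in-Decoder (FiD) generator."""
    combined: List[Example] = []
    for examples in per_task_examples.values():
        combined.extend(filter_by_relevance(examples, threshold))
    return combined
```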
Related papers
- The Power of Active Multi-Task Learning in Reinforcement Learning from Human Feedback [12.388205905012423]
Reinforcement learning from human feedback has contributed to performance improvements in large language models.
We formulate RLHF as the contextual dueling bandit problem and assume a common linear representation.
We prove that, to achieve $\varepsilon$-optimality, the sample complexity of the source tasks can be significantly reduced.
arXiv Detail & Related papers (2024-05-18T08:29:15Z) - Retrieval-Generation Synergy Augmented Large Language Models [30.53260173572783]
We propose an iterative retrieval-generation collaborative framework.
We conduct experiments on four question answering datasets, including single-hop QA and multi-hop QA tasks.
arXiv Detail & Related papers (2023-10-08T12:50:57Z) - Pre-training Multi-task Contrastive Learning Models for Scientific
Literature Understanding [52.723297744257536]
Pre-trained language models (LMs) have shown effectiveness in scientific literature understanding tasks.
We propose a multi-task contrastive learning framework, SciMult, to facilitate common knowledge sharing across different literature understanding tasks.
arXiv Detail & Related papers (2023-05-23T16:47:22Z) - Composite Learning for Robust and Effective Dense Predictions [81.2055761433725]
Multi-task learning promises better model generalization on a target task by jointly optimizing it with an auxiliary task.
We find that jointly training a dense prediction (target) task with a self-supervised (auxiliary) task can consistently improve the performance of the target task, while eliminating the need for labeling auxiliary tasks.
arXiv Detail & Related papers (2022-10-13T17:59:16Z) - Task Compass: Scaling Multi-task Pre-training with Task Prefix [122.49242976184617]
Existing studies show that multi-task learning with large-scale supervised tasks suffers from negative effects across tasks.
We propose a task prefix guided multi-task pre-training framework to explore the relationships among tasks.
Our model can not only serve as the strong foundation backbone for a wide range of tasks but also be feasible as a probing tool for analyzing task relationships.
arXiv Detail & Related papers (2022-10-12T15:02:04Z) - Combining Modular Skills in Multitask Learning [149.8001096811708]
A modular design encourages neural models to disentangle and recombine different facets of knowledge to generalise more systematically to new tasks.
In this work, we assume each task is associated with a subset of latent discrete skills from a (potentially small) inventory.
We find that the modular design of a network significantly increases sample efficiency in reinforcement learning and few-shot generalisation in supervised learning.
arXiv Detail & Related papers (2022-02-28T16:07:19Z) - Evidentiality-guided Generation for Knowledge-Intensive NLP Tasks [59.761411682238645]
Retrieval-augmented generation models have shown state-of-the-art performance across many knowledge-intensive NLP tasks.
We introduce a method to incorporate evidentiality of passages -- whether a passage contains correct evidence to support the output -- into training the generator.
arXiv Detail & Related papers (2021-12-16T08:18:47Z) - STraTA: Self-Training with Task Augmentation for Better Few-shot
Learning [77.04780470527432]
We propose STraTA, which stands for Self-Training with Task Augmentation.
Our experiments demonstrate that STraTA can substantially improve sample efficiency across 12 few-shot benchmarks.
Our analyses reveal that task augmentation and self-training are both complementary and independently effective.
arXiv Detail & Related papers (2021-09-13T19:14:01Z) - Understanding and Improving Information Transfer in Multi-Task Learning [14.43111978531182]
We study an architecture with a shared module for all tasks and a separate output module for each task.
We show that misalignment between task data can cause negative transfer (or hurt performance) and provide sufficient conditions for positive transfer.
Inspired by the theoretical insights, we show that aligning tasks' embedding layers leads to performance gains for multi-task training and transfer learning.
arXiv Detail & Related papers (2020-05-02T23:43:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.