Evidentiality-guided Generation for Knowledge-Intensive NLP Tasks
- URL: http://arxiv.org/abs/2112.08688v1
- Date: Thu, 16 Dec 2021 08:18:47 GMT
- Title: Evidentiality-guided Generation for Knowledge-Intensive NLP Tasks
- Authors: Akari Asai, Matt Gardner, Hannaneh Hajishirzi
- Abstract summary: Retrieval-augmented generation models have shown state-of-the-art performance across many knowledge-intensive NLP tasks.
We introduce a method to incorporate evidentiality of passages -- whether a passage contains correct evidence to support the output -- into training the generator.
- Score: 59.761411682238645
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Retrieval-augmented generation models have shown state-of-the-art performance
across many knowledge-intensive NLP tasks such as open question answering and
fact verification. These models are trained to generate the final output given
the retrieved passages, which can be irrelevant to the original query, leading
to learning spurious cues or answer memorization. This work introduces a method
to incorporate evidentiality of passages -- whether a passage contains correct
evidence to support the output -- into training the generator. We introduce a
multi-task learning framework to jointly generate the final output and predict
the evidentiality of each passage, leveraging a new task-agnostic method to
obtain {\it silver} evidentiality labels for supervision. Our experiments on
five datasets across three knowledge-intensive tasks show that our new
evidentiality-guided generator significantly outperforms its direct counterpart
with the same-size model and advances the state of the art on FaVIQ-Ambig. We
attribute these improvements to both the auxiliary multi-task learning and
silver evidentiality mining techniques.
Related papers
- Multi-Stage Knowledge Integration of Vision-Language Models for Continual Learning [79.46570165281084]
We propose a Multi-Stage Knowledge Integration network (MulKI) to emulate the human learning process in distillation methods.
MulKI achieves this through four stages, including Eliciting Ideas, Adding New Ideas, Distinguishing Ideas, and Making Connections.
Our method demonstrates significant improvements in maintaining zero-shot capabilities while supporting continual learning across diverse downstream tasks.
arXiv Detail & Related papers (2024-11-11T07:36:19Z) - Enhancing Retrieval-Augmented LMs with a Two-stage Consistency Learning Compressor [4.35807211471107]
This work proposes a novel two-stage consistency learning approach for retrieved information compression in retrieval-augmented language models.
The proposed method is empirically validated across multiple datasets, demonstrating notable enhancements in precision and efficiency for question-answering tasks.
arXiv Detail & Related papers (2024-06-04T12:43:23Z) - Evolving Knowledge Distillation with Large Language Models and Active
Learning [46.85430680828938]
Large language models (LLMs) have demonstrated remarkable capabilities across various NLP tasks.
Previous research has attempted to distill the knowledge of LLMs into smaller models by generating annotated data.
We propose EvoKD: Evolving Knowledge Distillation, which leverages the concept of active learning to interactively enhance the process of data generation using large language models.
arXiv Detail & Related papers (2024-03-11T03:55:24Z) - Self-Convinced Prompting: Few-Shot Question Answering with Repeated
Introspection [13.608076739368949]
We introduce a novel framework that harnesses the potential of large-scale pre-trained language models.
Our framework processes the output of a typical few-shot chain-of-thought prompt, assesses the correctness of the response, scrutinizes the answer, and ultimately produces a new solution.
arXiv Detail & Related papers (2023-10-08T06:36:26Z) - Composite Learning for Robust and Effective Dense Predictions [81.2055761433725]
Multi-task learning promises better model generalization on a target task by jointly optimizing it with an auxiliary task.
We find that jointly training a dense prediction (target) task with a self-supervised (auxiliary) task can consistently improve the performance of the target task, while eliminating the need for labeling auxiliary tasks.
arXiv Detail & Related papers (2022-10-13T17:59:16Z) - Recitation-Augmented Language Models [85.30591349383849]
We show that RECITE is a powerful paradigm for knowledge-intensive NLP tasks.
Specifically, we show that by utilizing recitation as the intermediate step, a recite-and-answer scheme can achieve new state-of-the-art performance.
arXiv Detail & Related papers (2022-10-04T00:49:20Z) - Task Formulation Matters When Learning Continually: A Case Study in
Visual Question Answering [58.82325933356066]
Continual learning aims to train a model incrementally on a sequence of tasks without forgetting previous knowledge.
We present a detailed study of how different settings affect performance for Visual Question Answering.
arXiv Detail & Related papers (2022-09-30T19:12:58Z) - Multi-Task Retrieval-Augmented Text Generation with Relevance Sampling [19.17759446168802]
We study multi-task training of retrieval-augmented generation models for knowledge-intensive tasks.
We filter training examples via a threshold of confidence on the relevance labels, whether a pair is answerable by the knowledge base or not.
arXiv Detail & Related papers (2022-07-07T00:57:02Z) - Reinforcement Learning with Prototypical Representations [114.35801511501639]
Proto-RL is a self-supervised framework that ties representation learning with exploration through prototypical representations.
These prototypes simultaneously serve as a summarization of the exploratory experience of an agent as well as a basis for representing observations.
This enables state-of-the-art downstream policy learning on a set of difficult continuous control tasks.
arXiv Detail & Related papers (2021-02-22T18:56:34Z) - Learning Invariant Representation for Continual Learning [5.979373021392084]
A key challenge in Continual learning is catastrophically forgetting previously learned tasks when the agent faces a new one.
We propose a new pseudo-rehearsal-based method, named learning Invariant Representation for Continual Learning (IRCL)
Disentangling the shared invariant representation helps to learn continually a sequence of tasks, while being more robust to forgetting and having better knowledge transfer.
arXiv Detail & Related papers (2021-01-15T15:12:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.