FiD-Ex: Improving Sequence-to-Sequence Models for Extractive Rationale
Generation
- URL: http://arxiv.org/abs/2012.15482v1
- Date: Thu, 31 Dec 2020 07:22:15 GMT
- Title: FiD-Ex: Improving Sequence-to-Sequence Models for Extractive Rationale
Generation
- Authors: Kushal Lakhotia, Bhargavi Paranjape, Asish Ghoshal, Wen-tau Yih,
Yashar Mehdad, Srinivasan Iyer
- Abstract summary: We develop FiD-Ex, which addresses shortcomings of seq2seq models by introducing sentence markers to eliminate explanation fabrication.
FiD-Ex significantly improves over prior work in terms of explanation metrics and task accuracy, on multiple tasks from the ERASER explainability benchmark.
- Score: 19.73842483996047
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Natural language (NL) explanations of model predictions are gaining
popularity as a means to understand and verify decisions made by large
black-box pre-trained models, for NLP tasks such as Question Answering (QA) and
Fact Verification. Recently, pre-trained sequence to sequence (seq2seq) models
have proven to be very effective in jointly making predictions, as well as
generating NL explanations. However, these models have many shortcomings; they
can fabricate explanations even for incorrect predictions, they are difficult
to adapt to long input documents, and their training requires a large amount of
labeled data. In this paper, we develop FiD-Ex, which addresses these
shortcomings for seq2seq models by: 1) introducing sentence markers to
eliminate explanation fabrication by encouraging extractive generation, 2)
using the fusion-in-decoder architecture to handle long input contexts, and 3)
intermediate fine-tuning on re-structured open domain QA datasets to improve
few-shot performance. FiD-Ex significantly improves over prior work in terms of
explanation metrics and task accuracy, on multiple tasks from the ERASER
explainability benchmark, both in the fully supervised and in the few-shot
settings.
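As a rough illustration of the sentence-marker idea (point 1 above), the sketch below prefixes each context sentence with an index token and maps generated markers back to the source sentences, which keeps the emitted rationale extractive by construction. The function names and marker format are hypothetical, not taken from the paper's released code.

```python
# Illustrative sketch of the sentence-marker trick used to keep generated
# rationales extractive. Names and formats are hypothetical.

def mark_sentences(sentences):
    """Prefix each sentence with a marker token, e.g. 'sent1: ...'."""
    return " ".join(f"sent{i}: {s}" for i, s in enumerate(sentences, start=1))

def decode_rationale(generation, sentences):
    """Map emitted markers (e.g. 'answer: yes sent2') back to source sentences."""
    picked = []
    for token in generation.split():
        if token.startswith("sent") and token[4:].isdigit():
            idx = int(token[4:]) - 1
            if 0 <= idx < len(sentences):
                picked.append(sentences[idx])
    return picked

sents = ["The sky is blue.", "Water boils at 100 C.", "Cats are mammals."]
model_input = "question: which sentence is about physics? context: " + mark_sentences(sents)
print(decode_rationale("answer: sent2", sents))  # ['Water boils at 100 C.']
```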
Related papers
- MinPrompt: Graph-based Minimal Prompt Data Augmentation for Few-shot Question Answering [64.6741991162092]
We present MinPrompt, a minimal data augmentation framework for open-domain question answering.
We transform the raw text into a graph structure to build connections between different factual sentences.
We then apply graph algorithms to identify the minimal set of sentences needed to cover the most information in the raw text.
We generate QA pairs based on the identified sentence subset and train the model on the selected sentences to obtain the final model.
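The abstract does not spell out the graph algorithm, so the following is only a hedged sketch of the underlying intuition of selecting a small sentence subset that covers most of a document's content, here with a greedy set-cover heuristic over content words; MinPrompt's actual graph construction and selection may differ.

```python
# Hedged sketch: greedily choose a small subset of sentences that covers a
# document's content words. MinPrompt builds a graph over factual sentences
# and applies graph algorithms; this stand-in only conveys the
# "minimal covering subset" intuition.

def greedy_sentence_cover(sentences):
    """Greedily pick sentences until their union covers all content words."""
    sent_words = [{w.lower().strip(".,") for w in s.split() if len(w) > 3}
                  for s in sentences]
    universe = set().union(*sent_words)
    covered, chosen = set(), []
    while covered != universe:
        best = max(range(len(sentences)), key=lambda i: len(sent_words[i] - covered))
        if not sent_words[best] - covered:
            break
        chosen.append(sentences[best])
        covered |= sent_words[best]
    return chosen

print(greedy_sentence_cover([
    "Paris is the capital of France.",
    "France borders Spain and Germany.",
    "Paris hosts the Louvre museum.",
]))
```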
arXiv Detail & Related papers (2023-10-08T04:44:36Z) - Tokenization Consistency Matters for Generative Models on Extractive NLP
Tasks [54.306234256074255]
We identify the issue of tokenization inconsistency that is commonly neglected in training generative models.
When the input and output are tokenized inconsistently, this issue compromises the extractive nature of these tasks.
We show that, with consistent tokenization, the model performs better in both in-domain and out-of-domain datasets.
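A toy illustration of the inconsistency: for BPE-style tokenizers, an answer span tokenized on its own often differs from the same span tokenized as it appears inside the context (e.g. because of a leading space). The snippet assumes the Hugging Face transformers package is installed and downloads the GPT-2 tokenizer on first use; it is not the paper's code.

```python
# Toy illustration of tokenization inconsistency for extractive targets.
# Assumes the Hugging Face `transformers` package (downloads the GPT-2
# tokenizer on first run).

from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")

# Context: "The capital of France is Paris."   Target span: "Paris"

# Inconsistent: tokenize the target on its own.
standalone_ids = tok("Paris", add_special_tokens=False)["input_ids"]

# Consistent: tokenize the target as it appears inside the context,
# i.e. with its leading space.
in_context_ids = tok(" Paris", add_special_tokens=False)["input_ids"]

print(standalone_ids, in_context_ids)  # typically different ids for BPE tokenizers
```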
arXiv Detail & Related papers (2022-12-19T23:33:21Z) - Confident Adaptive Language Modeling [95.45272377648773]
CALM is a framework for dynamically allocating different amounts of compute per input and generation timestep.
We demonstrate the efficacy of our framework in reducing compute -- potential speedup of up to $\times 3$ -- while provably maintaining high performance.
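The sketch below shows the general shape of per-timestep early exiting, the mechanism CALM builds on: stop propagating through decoder layers once the token prediction is confident enough. The paper's confidence measures and calibration guarantees are omitted, and all numbers here are synthetic.

```python
# Schematic of confidence-based early exiting per generation timestep.

import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def early_exit_predict(per_layer_logits, threshold=0.8):
    """Stop at the first decoder layer whose top probability clears the threshold."""
    for layer, logits in enumerate(per_layer_logits, start=1):
        probs = softmax(logits)
        if probs.max() >= threshold:
            return int(probs.argmax()), layer
    return int(probs.argmax()), layer  # fall back to the final layer

rng = np.random.default_rng(0)
# Later layers produce sharper (more confident) distributions in this toy example.
layers = [rng.normal(size=50) * scale for scale in (1.0, 4.0, 12.0)]
token_id, layer_used = early_exit_predict(layers)
print(token_id, layer_used)
```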
arXiv Detail & Related papers (2022-07-14T17:00:19Z) - VisFIS: Visual Feature Importance Supervision with
Right-for-the-Right-Reason Objectives [84.48039784446166]
We show that model FI supervision can meaningfully improve VQA model accuracy as well as performance on several Right-for-the-Right-Reason metrics.
Our best performing method, Visual Feature Importance Supervision (VisFIS), outperforms strong baselines on benchmark VQA datasets.
Predictions are more accurate when explanations are plausible and faithful, and not when they are plausible but not faithful.
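As a hedged sketch of feature-importance (FI) supervision, the snippet below adds a term that pulls a gradient-based saliency toward human-annotated importance; VisFIS combines several Right-for-the-Right-Reason objectives, so this single MSE term and all names below are illustrative only. PyTorch is assumed.

```python
# Hedged sketch of one FI supervision term: align gradient saliency with
# human-annotated importance. Not the VisFIS objectives themselves.

import torch

def fi_supervision_loss(model, features, label, human_importance):
    features = features.clone().requires_grad_(True)
    logits = model(features)
    task_loss = torch.nn.functional.cross_entropy(logits.unsqueeze(0), label.unsqueeze(0))
    grads, = torch.autograd.grad(task_loss, features, create_graph=True)
    saliency = grads.abs().sum(dim=-1)              # one importance score per region
    saliency = saliency / (saliency.sum() + 1e-8)
    fi_loss = torch.nn.functional.mse_loss(saliency, human_importance)
    return task_loss + fi_loss

model = torch.nn.Sequential(torch.nn.Flatten(0), torch.nn.Linear(36 * 8, 3))
features = torch.randn(36, 8)                       # e.g. 36 image-region features
label = torch.tensor(1)
human_importance = torch.softmax(torch.randn(36), dim=0)
print(fi_supervision_loss(model, features, label, human_importance).item())
```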
arXiv Detail & Related papers (2022-06-22T17:02:01Z) - Complex Event Forecasting with Prediction Suffix Trees: Extended
Technical Report [70.7321040534471]
Complex Event Recognition (CER) systems have become popular in the past two decades due to their ability to "instantly" detect patterns on real-time streams of events.
There is a lack of methods for forecasting when a pattern might occur before such an occurrence is actually detected by a CER engine.
We present a formal framework that attempts to address the issue of Complex Event Forecasting.
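The framework itself is formal and beyond this summary, but the data structure named in the title can be illustrated: a prediction suffix tree style model counts next-event frequencies conditioned on bounded-length suffixes of the stream and predicts from the longest matching one. This is a didactic sketch only.

```python
# Didactic sketch of a prediction-suffix-tree style next-event model.

from collections import defaultdict

class SuffixPredictor:
    def __init__(self, max_order=3):
        self.max_order = max_order
        self.counts = defaultdict(lambda: defaultdict(int))

    def fit(self, stream):
        for i in range(len(stream)):
            for k in range(self.max_order + 1):
                if i - k < 0:
                    break
                self.counts[tuple(stream[i - k:i])][stream[i]] += 1

    def predict(self, history):
        for k in range(self.max_order, -1, -1):   # longest matching suffix first
            if k > len(history):
                continue
            ctx = tuple(history[len(history) - k:]) if k else ()
            if ctx in self.counts:
                nxt = self.counts[ctx]
                return max(nxt, key=nxt.get)
        return None

p = SuffixPredictor()
p.fit(list("abcabcabd"))
print(p.predict(list("ab")))  # 'c' is the most frequent continuation of "ab"
```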
arXiv Detail & Related papers (2021-09-01T09:52:31Z) - FairCanary: Rapid Continuous Explainable Fairness [8.362098382773265]
We present Quantile Demographic Drift (QDD), a novel model bias quantification metric.
QDD is ideal for continuous monitoring scenarios and does not suffer from the statistical limitations of conventional threshold-based bias metrics.
We incorporate QDD into a continuous model monitoring system, called FairCanary, that reuses existing explanations computed for each individual prediction.
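The summary does not give the formula, so the following is only a sketch in the spirit of a quantile-based disparity measure: compare two groups' score distributions across quantile bins rather than at a single decision threshold. The exact QDD definition in the paper differs in detail.

```python
# Hedged sketch of a quantile-based group disparity measure (illustration only).

import numpy as np

def quantile_disparity(scores_a, scores_b, n_quantiles=10):
    """Mean absolute gap between the two groups' score quantiles."""
    qs = np.linspace(0.05, 0.95, n_quantiles)
    return float(np.mean(np.abs(np.quantile(scores_a, qs) - np.quantile(scores_b, qs))))

rng = np.random.default_rng(1)
group_a = rng.beta(2.0, 5.0, size=1000)   # model scores for group A
group_b = rng.beta(2.5, 5.0, size=1000)   # slightly shifted scores for group B
print(quantile_disparity(group_a, group_b))
```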
arXiv Detail & Related papers (2021-06-13T17:47:44Z) - Masked Language Modeling and the Distributional Hypothesis: Order Word
Matters Pre-training for Little [74.49773960145681]
A possible explanation for the impressive performance of masked language model (MLM) pre-training is that such models have learned to represent the syntactic structures prevalent in NLP pipelines.
In this paper, we propose a different explanation: MLMs succeed on downstream tasks almost entirely due to their ability to model higher-order word co-occurrence statistics.
Our results show that purely distributional information largely explains the success of pre-training, and underscore the importance of curating challenging evaluation datasets that require deeper linguistic knowledge.
arXiv Detail & Related papers (2021-04-14T06:30:36Z) - Explain and Predict, and then Predict Again [6.865156063241553]
We propose ExPred, which uses multi-task learning in the explanation generation phase to effectively trade off explanation and prediction losses.
We conduct an extensive evaluation of our approach on three diverse language datasets.
arXiv Detail & Related papers (2021-01-11T19:36:52Z)
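A generic sketch of the multi-task trade-off described above: a shared encoder feeds a per-token rationale head and a task prediction head, and the two losses are mixed with a weight alpha. This assumes PyTorch and is not the ExPred architecture, only the loss trade-off its summary describes.

```python
# Generic explain-then-predict multi-task objective (illustrative only).

import torch
import torch.nn as nn

class ExplainThenPredict(nn.Module):
    def __init__(self, vocab=1000, hidden=64, num_labels=2, alpha=0.5):
        super().__init__()
        self.alpha = alpha
        self.embed = nn.Embedding(vocab, hidden)
        self.encoder = nn.GRU(hidden, hidden, batch_first=True)
        self.rationale_head = nn.Linear(hidden, 1)    # per-token keep/drop logit
        self.task_head = nn.Linear(hidden, num_labels)

    def forward(self, tokens, rationale_labels, task_labels):
        h, _ = self.encoder(self.embed(tokens))           # [B, T, H]
        expl_logits = self.rationale_head(h).squeeze(-1)  # [B, T]
        task_logits = self.task_head(h.mean(dim=1))       # [B, num_labels]
        expl_loss = nn.functional.binary_cross_entropy_with_logits(
            expl_logits, rationale_labels.float())
        task_loss = nn.functional.cross_entropy(task_logits, task_labels)
        return self.alpha * expl_loss + (1 - self.alpha) * task_loss

model = ExplainThenPredict()
tokens = torch.randint(0, 1000, (4, 12))        # toy batch of token ids
rationales = torch.randint(0, 2, (4, 12))       # gold per-token rationale labels
labels = torch.randint(0, 2, (4,))              # gold task labels
print(model(tokens, rationales, labels).item())
```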
This list is automatically generated from the titles and abstracts of the papers on this site.